SYSTEM FOR THE IN VIVO DELIVERY AND
EXPRESSION OF HETEROLOGOUS GENES IN
THE BONE MARROW

FEDERALLY SPONSORED RESEARCH 
This invention was made with Government support under Grant 
Number 5 R01 AI22186 from the National Institutes of Health. The Government
has certain rights to this invention. 

FIELD OF THE INVENTION 
The present invention relates to recombinant DNA technology, and in 
particular to introducing and expressing foreign DNA in a eukaryotic cell. 

BACKGROUND OF THE INVENTION 
The Alphavirus genus includes a variety of viruses all of which are 
members of the Togaviridae family. The alphaviruses include Eastern Equine 
Encephalitis virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades 
virus, Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE), 
Sindbis virus, South African Arbovirus No. 86 (S.A.AR 86), Girdwood S.A. 
virus, Ockelbo virus, Semliki Forest virus, Middelburg virus, Chikungunya virus, 
O'Nyong-Nyong virus, Ross River virus, Barmah Forest virus, Getah virus, 
Sagiyama virus, Bebaru virus, Mayaro virus, Una virus, Aura virus, Whataroa 
virus, Babanki virus, Kyzylagach virus, Highlands J virus, Fort Morgan virus, 
Ndumu virus, and Buggy Creek virus.

The alphavirus genome is a single-stranded, messenger-sense RNA,
modified at the 5'-end with a methylated cap, and at the 3'-end with a
variable-length poly(A) tract. The viral genome is divided into two regions: the first
encodes the nonstructural or replicase proteins (nsP1-nsP4) and the second encodes
the viral structural proteins. Strauss and Strauss, Microbiological Rev. 58,
491-562, 494 (1994). Structural subunits consisting of a single viral protein, C,
associate with themselves and with the RNA genome in an icosahedral
nucleocapsid. In the virion, the capsid is surrounded by a lipid envelope covered
with a regular array of transmembranal protein spikes, each of which consists of
a heterodimeric complex of two glycoproteins, E1 and E2. See Paredes et al.,
Proc. Natl. Acad. Sci. USA 90, 9095-99 (1993); Paredes et al., Virology 187,
324-32 (1993); Pedersen et al., J. Virol. 14:40 (1974).



Sindbis virus, the prototype member of the alphavirus genus of the 
family Togaviridae, and viruses related to Sindbis are broadly distributed 
throughout Africa, Europe, Asia, the Indian subcontinent, and Australia, based on 
serological surveys of humans, domestic animals and wild birds. Kokernot et al.,
Trans. R. Soc. Trop. Med. Hyg. 59, 553-62 (1965); Redaksie, S. Afr. Med. J. 42,
197 (1968); Adekolu-John and Fagbami, Trans. R. Soc. Trop. Med. Hyg. 77,
149-51 (1983); Darwish et al., Trans. R. Soc. Trop. Med. Hyg. 77, 442-45 (1983);
Lundström et al., Epidemiol. Infect. 106, 567-74 (1991); Morrill et al., J. Trop.
Med. Hyg. 94, 166-68 (1991). The first isolate of Sindbis virus (strain AR339) 
was recovered from a pool of Culex sp. mosquitoes collected in Sindbis, Egypt in 
1953 (Taylor et al., Am. J. Trop. Med. Hyg. 4, 844-62 (1955)), and is the most 
extensively studied representative of this group. Other members of the Sindbis 
group of alphaviruses include South African Arbovirus No. 86, Ockelbo82, and
Girdwood S.A. These viruses are not strains of the Sindbis virus; they are related 
to Sindbis AR339, but they are more closely related to each other based on 
nucleotide sequence and serological comparisons. Lundström et al., J. Wildl. Dis.
29, 189-95 (1993); Simpson et al., Virology 222, 464-69 (1996). Ockelbo82, 
S.A.AR86 and Girdwood S.A. are all associated with human disease, whereas 
Sindbis is not. The clinical symptoms of human infection with Ockelbo82, 
S.A.AR86, or Girdwood S.A. are a febrile illness, general malaise, maculopapular
rash, and joint pain that occasionally progresses to a polyarthralgia sometimes 
lasting from a few months to a few years. 



The study of these viruses has led to the development of beneficial 
techniques for vaccinating against the alphavirus diseases, and other diseases
through the use of alphavirus vectors for the introduction of foreign DNA. See
United States Patent No. 5,185,440 to Davis et al., and PCT Publication WO 
92/10578. It is intended that all United States patent references be incorporated 
in their entirety by reference. 



It is well known that live, attenuated viral vaccines are among the 
most successful means of controlling viral disease. However, for some virus 
pathogens, immunization with a live virus strain may be either impractical or 
unsafe. One alternative strategy is the insertion of sequences encoding immunizing 
antigens of such agents into a vaccine strain of another virus. One such system 
utilizing a live VEE vector is described in United States Patent No. 5,505,947 to 
Johnston et al. 



Sindbis virus vaccines have been employed as viral carriers in virus 
constructs which express genes encoding immunizing antigens for other viruses. 
See United States Patent No. 5,217,879 to Huang et al. Huang et al. describes 
Sindbis infectious viral vectors. However, the reference does not describe the 
cDNA sequence of Girdwood S.A. and TR339, nor clones or viral vectors 
produced therefrom. 



Another such system is described by Hahn et al., Proc. Natl. Acad.
Sci. USA 89:2679 (1992), wherein Sindbis virus constructs which express a
truncated form of the influenza hemagglutinin protein are described. The
constructs are used to study antigen processing and presentation in vitro and in
mice. Although no infectious challenge dose is tested, it is also suggested that
such constructs might be used to produce protective B- and T-cell mediated
immunity.

London et al., Proc. Natl. Acad. Sci. USA 89, 207-11 (1992)
disclose a method of producing an immune response in mice against a lethal Rift
Valley Fever (RVF) virus by infecting the mice with an infectious Sindbis virus
containing an RVF epitope. London does not disclose using Girdwood S.A. or
TR339 to induce an immune response in animals.

Viral carriers can also be used to introduce and express foreign
DNA in eukaryotic cells. One goal of such techniques is to employ vectors that 
target expression to particular cells and/or tissues. A current approach has been 
to remove target cells from the body, culture them ex vivo, infect them with an 
expression vector, and then reintroduce them into the patient. 

PCT Publication No. WO 92/10578 to Garoff and Liljeström
provides a system for introducing and expressing foreign proteins in animal cells
using alphaviruses. This reference discloses the use of Semliki Forest virus to
introduce and express foreign proteins in animal cells. The use of Girdwood S.A.
or TR339 is not discussed. Furthermore, this reference does not provide a method
of targeting and introducing foreign DNA into specific cell or tissue types.

Accordingly, there remains a need in the art for full-length cDNA 
clones of positive-strand RNA viruses, such as Girdwood S.A. and TR339. In
addition, there is an ongoing need in the art for improved vaccination strategies. 
Finally, there remains a need in the art for improved methods and nucleic acid 
sequences for delivering foreign DNA to target cells. 



SUMMARY OF THE INVENTION
A first aspect of the present invention is a method of introducing 
and expressing heterologous RNA in bone marrow cells, comprising: (a) providing 
a recombinant alphavirus, the alphavirus containing a heterologous RNA segment,
the heterologous RNA segment comprising a promoter operable in bone marrow 
cells operatively associated with a heterologous RNA to be expressed in bone 
marrow cells; and then (b) contacting the recombinant alphavirus to the bone 
marrow cells so that the heterologous RNA segment is introduced and expressed 
therein. 

As a second aspect, the present invention provides a helper cell for 
expressing an infectious, propagation defective, Girdwood S.A. virus particle, 
comprising, in a Girdwood S.A.-permissive cell: (a) a first helper RNA encoding 
(i) at least one Girdwood S.A. structural protein, and (ii) not encoding at least one
other Girdwood S.A. structural protein; and (b) a second helper RNA separate
from the first helper RNA, the second helper RNA (i) not encoding the at least one
Girdwood S.A. structural protein encoded by the first helper RNA, and (ii)
encoding the at least one other Girdwood S.A. structural protein not encoded by
the first helper RNA, and with all of the Girdwood S.A. structural proteins 
encoded by the first and second helper RNAs assembling together into Girdwood 
S.A. particles in the cell containing the replicon RNA; and wherein the Girdwood 
S.A. packaging segment is deleted from at least the first helper RNA. 

A third aspect of the present invention is a method of making 
infectious, propagation defective, Girdwood S.A. virus particles, comprising: 
transfecting a Girdwood S.A.-permissive cell with a propagation defective replicon 
RNA, the replicon RNA including the Girdwood S.A. packaging segment and an 
inserted heterologous RNA; producing the Girdwood S.A. virus particles in the 
transfected cell; and then collecting the Girdwood S.A. virus particles from the 
cell. Also disclosed are infectious Girdwood S.A. RNAs, cDNAs encoding the 
same, infectious Girdwood S.A. virus particles, and pharmaceutical formulations 
thereof. 

As a fourth aspect, the present invention provides a helper cell for 
expressing an infectious, propagation defective, TR339 virus particle, comprising,
in a TR339-permissive cell: (a) a first helper RNA encoding (i) at least one TR339
structural protein, and (ii) not encoding at least one other TR339 structural protein;
and (b) a second helper RNA separate from the first helper RNA, the second
helper RNA (i) not encoding the at least one TR339 structural protein encoded by
the first helper RNA, and (ii) encoding the at least one other TR339 structural 
protein not encoded by the first helper RNA, and with all of the TR339 structural 
proteins encoded by the first and second helper RNAs assembling together into 
TR339 particles in the cell containing the replicon RNA; and wherein the TR339 
packaging segment is deleted from at least the first helper RNA. 

A fifth aspect of the present invention is a method of making 
infectious, propagation defective, TR339 virus particles, comprising: transfecting 
a TR339-permissive cell with a propagation defective replicon RNA, the replicon 
RNA including the TR339 packaging segment and an inserted heterologous RNA; 
producing the TR339 virus particles in the transfected cell; and then collecting the 
TR339 virus particles from the cell. Also disclosed are infectious TR339 RNAs, 
cDNAs encoding the same, infectious TR339 virus particles, and pharmaceutical 
formulations thereof. 

As a sixth aspect, the present invention provides a recombinant 
DNA comprising a cDNA coding for an infectious Girdwood S.A. virus RNA
transcript, and a heterologous promoter positioned upstream from the cDNA and 
operatively associated therewith. The present invention also provides infectious 
RNA transcripts encoded by the above-mentioned cDNA and infectious viral 
particles containing the infectious RNA transcripts. 

As a seventh aspect, the present invention provides a recombinant 
DNA comprising a cDNA coding for a Sindbis strain TR339 RNA transcript, and 
a heterologous promoter positioned upstream from the cDNA and operatively 
associated therewith. The present invention also provides infectious RNA 
transcripts encoded by the above-mentioned cDNA and infectious viral particles 
containing the infectious RNA transcripts.

The foregoing and other aspects of the present invention are
described in the detailed description set forth below.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 presents the cDNA sequence (SEQ ID NO:1) of
S.A.AR86. The RNA sequence of the 5' 40 nucleotides was obtained by direct
sequencing of the genomic RNA. The rest of the genome was sequenced by
RT-PCR of fragments amplified from virion RNA. Nucleotides 1 through 59
represent the 5' UTR, the non-structural polyprotein is encoded by nucleotides 60
through 7559 (nsP1-nt60 through nt1679; nsP2-nt1680 through nt4099;
nsP3-nt4100 through nt5729; nsP4-nt5730 through nt7559), the structural polyprotein
is encoded by nucleotides 7608 through 11342 (capsid-nt7608 through nt8399;
E3-nt8400 through nt8591; E2-nt8592 through nt9860; 6K-nt9861 through nt10025;
E1-nt10026 through nt11342), and the 3' UTR is represented by nucleotides
11346 through 11663.
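
By way of illustration only, the region coordinates recited above for Figure 1 can be tabulated and checked for internal consistency. The following is a minimal sketch and forms no part of the sequence data; the names `SAAR86_REGIONS`, `region_length`, and `is_contiguous` are hypothetical, and the coordinates are assumed to be 1-based and inclusive, as the recitation suggests.

```python
# Hypothetical helper: tabulate the S.A.AR86 genome regions recited for Figure 1.
# Coordinates are assumed to be 1-based and inclusive (start, end).
SAAR86_REGIONS = {
    "5' UTR": (1, 59),
    "nsP1": (60, 1679),
    "nsP2": (1680, 4099),
    "nsP3": (4100, 5729),
    "nsP4": (5730, 7559),
    "capsid": (7608, 8399),
    "E3": (8400, 8591),
    "E2": (8592, 9860),
    "6K": (9861, 10025),
    "E1": (10026, 11342),
    "3' UTR": (11346, 11663),
}


def region_length(name):
    """Return the length in nucleotides of a named region."""
    start, end = SAAR86_REGIONS[name]
    return end - start + 1


def is_contiguous(names):
    """Return True if each region starts one nucleotide after the previous one ends."""
    pairs = zip(names, names[1:])
    return all(SAAR86_REGIONS[b][0] == SAAR86_REGIONS[a][1] + 1 for a, b in pairs)


NONSTRUCTURAL = ["nsP1", "nsP2", "nsP3", "nsP4"]
STRUCTURAL = ["capsid", "E3", "E2", "6K", "E1"]

if __name__ == "__main__":
    for name in SAAR86_REGIONS:
        print(f"{name}: {region_length(name)} nt")
    print("non-structural contiguous:", is_contiguous(NONSTRUCTURAL))
    print("structural contiguous:", is_contiguous(STRUCTURAL))
```

Such a check confirms only that the recited sub-regions tile the recited polyprotein spans; it says nothing about the underlying sequence.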



Figure 1A shows nucleotides 1 through 3800 of the cDNA sequence
of S.A.AR86.



Figure 1B shows nucleotides 3801 through 7900 of the cDNA
sequence of S.A.AR86. 

Figure 1C shows nucleotides 7901 through 11663 of the cDNA 
sequence of S.A.AR86. 

Figure 2 presents the putative amino acid sequences of the 
S.A.AR86 polyproteins (SEQ ID NO:2 and SEQ ID NO:3). The amino acids 
were derived from the S.A.AR86 cDNA sequence given in Figure 1 (SEQ ID 
NO:1).

Figure 2A shows the amino acid sequence of the non-structural
polyprotein of S.A.AR86 (SEQ ID NO:2). 

Figure 2B shows the amino acid sequence of the structural 
polyprotein of S.A.AR86 (SEQ ID NO:3). 

Figure 3 presents the cDNA sequence (SEQ ID NO:4) of Girdwood 
S.A. The RNA sequence of the 5' 40 nucleotides was obtained by direct 
sequencing of the genomic RNA. The rest of the genome sequence was obtained 
by sequencing of fragments amplified by RT-PCR from virion RNA. An "N" in 
the sequence indicates that the identity of the nucleotide at that position is 
unknown. Nucleotides 1 through 59 represent the 5' UTR, the non-structural 
polyprotein is encoded by nucleotides 60 through 7613 (nsP1-nt60 through
nt1679; nsP2-nt1680 through nt4099; nsP3-nt4100 through nt5762 or nt5783;
nsP4-nt5784 through nt7613), the structural polyprotein is encoded by nucleotides
7662 through 11396 (capsid-nt7662 through nt8453; E3-nt8454 through nt8645;
E2-nt8646 through nt9914; 6K-nt9915 through nt10079; E1-nt10080 through
nt11396), and the 3' UTR is represented by nucleotides 11400 through 11717.
There is an opal termination codon at nucleotides 5763 through 5765. 

Figure 3A shows nucleotides 1 through 3800 of the cDNA sequence 
of Girdwood S.A. 



Figure 3B shows nucleotides 3801 through 7900 of the cDNA
sequence of Girdwood S.A.

Figure 3C shows nucleotides 7901 through 11717 of the cDNA 
sequence of Girdwood S.A. 

Figure 4 illustrates the putative amino acid sequences of the 
Girdwood S.A. polyproteins (SEQ ID NO:5 and SEQ ID NO:6). The amino
acids were derived from the Girdwood S.A. cDNA sequence given in Figure 3
(SEQ ID NO:4). 

Figure 4A shows the amino acid sequence of the non-structural 
polyprotein of Girdwood S.A. The sequence terminates at the opal termination 
codon. The complete amino acid sequence is presented in SEQ ID NO:5. 

Figure 4B shows the amino acid sequence of the structural 
polyprotein of Girdwood S.A. (SEQ ID NO:6).

Figure 5 illustrates the nucleotide sequence (SEQ ID NO:7) of 
clone pS55, a cDNA clone of the S.A.AR86 genomic RNA. 

Figure 5A shows nucleotides 1 through 6720 of the cDNA sequence
of pS55.

Figure 5B shows nucleotides 6721 through 11663 of the cDNA
sequence of pS55. 

Figure 6 presents the cDNA sequence (SEQ ID NO:8) of clone 
pTR339. The TR339 virus is derived from this clone. Nucleotides 1 through 59 
represent the 5' UTR, the non-structural polyprotein is encoded by nucleotides 60 
through 7598 (nsP1-nt60 through nt1679; nsP2-nt1680 through nt4099;
nsP3-nt4100 through nt5747 or nt5768; nsP4-nt5769 through nt7598), the structural
polyprotein is encoded by nucleotides 7647 through 11381 (capsid-nt7647 through
nt8438; E3-nt8439 through nt8630; E2-nt8631 through nt9899; 6K-nt9900
through nt10064; E1-nt10065 through nt11381), and the 3' UTR is represented
by nucleotides 11382 through 11703. There is an opal termination codon at 
nucleotides 5748 through 5750. 

Figure 6A shows nucleotides 1 through 6720 of the cDNA sequence
of pTR339.

Figure 6B shows nucleotides 6721 through 11703 of the cDNA
sequence of pTR339. 

DETAILED DESCRIPTION OF THE INVENTION
The production and use of recombinant DNA, vectors, transformed
host cells, selectable markers, proteins, and protein fragments by genetic
engineering are well-known to those skilled in the art. See, e.g., United States
Patent No. 4,761,371 to Bell et al. at Col. 6 line 3 to Col. 9 line 65; United States
Patent No. 4,877,729 to Clark et al. at Col. 4 line 38 to Col. 7 line 6; United
States Patent No. 4,912,038 to Schilling at Col. 3 line 26 to Col. 14 line 12; and
United States Patent No. 4,879,224 to Wallner at Col. 6 line 8 to Col. 8 line 59. 

The term "alphavirus" has its conventional meaning in the art, and 
includes the various species of alphaviruses such as Eastern Equine Encephalitis 
virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades virus, 
Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE), Sindbis virus,
South African Arbovirus No. 86, Girdwood S.A. virus, Ockelbo virus, Semliki 
Forest virus, Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross 
River virus, Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus, 
Mayaro virus, Una virus, Aura virus, Whataroa virus, Babanki virus, Kyzylagach
virus, Highlands J virus, Fort Morgan virus, Ndumu virus, Buggy Creek virus, 
and any other virus classified by the International Committee on Taxonomy of 
Viruses (ICTV) as an alphavirus. The preferred alphaviruses for use in the present 
invention include Sindbis virus strains (e.g., TR339), Girdwood S.A., S.A.AR86,
and Ockelbo82. 

An "Old World alphavirus" is a virus that is primarily distributed
throughout the Old World. Alternately stated, an Old World alphavirus is a virus 
that is primarily distributed throughout Africa, Asia, Australia and New Zealand, 
or Europe. Exemplary Old World viruses include SF group alphaviruses and SIN 
group alphaviruses. SF group alphaviruses include Semliki Forest virus, 
Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross River virus, 
Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus,
and Una virus. SIN group alphaviruses include Sindbis virus, South African 
Arbovirus No. 86, Ockelbo virus, Girdwood S.A. virus, Aura virus, Whataroa 
virus, Babanki virus, and Kyzylagach virus. 

Acceptable alphaviruses include those containing attenuating
mutations. The phrases "attenuating mutation" and "attenuating amino acid," as
used herein, mean a nucleotide sequence containing a mutation, or an amino acid
encoded by a nucleotide sequence containing a mutation, which mutation results
in a decreased probability of causing disease in its host (i.e., a loss of virulence),
in accordance with standard terminology in the art, whether the mutation be a
substitution mutation or an in-frame deletion mutation. See, e.g., B. DAVIS ET
AL., MICROBIOLOGY 132 (3d ed. 1980). The phrase "attenuating mutation"
excludes mutations or combinations of mutations which would be lethal to the
virus.



Appropriate attenuating mutations will be dependent upon the 
alphavirus used. Suitable attenuating mutations within the alphavirus genome will 
be known to those skilled in the art. Exemplary attenuating mutations include, but 
are not limited to, those described in United States Patent No. 5,505,947 to 
Johnston et al., copending United States application 08/448,630 to Johnston et al., 
and copending United States application 08/446,932 to Johnston et al. It is 
intended that all United States patent references be incorporated in their entirety 
by reference. 



Attenuating mutations may be introduced into the RNA by 
performing site-directed mutagenesis on the cDNA which encodes the RNA, in 
accordance with known procedures. See, Kunkel, Proc. Natl. Acad. Sci. USA 82, 
488 (1985), the disclosure of which is incorporated herein by reference in its 
entirety. Alternatively, mutations may be introduced into the RNA by replacement 
of homologous restriction fragments in the cDNA which encodes for the RNA, in 
accordance with known procedures.

I. Methods for Introducing and Expressing Heterologous RNA in Bone
Marrow Cells.

The present invention provides methods of using a recombinant 
alphavirus to introduce and express a heterologous RNA in bone marrow cells. 
Such methods are useful as vaccination strategies when the heterologous RNA 
encodes an immunogenic protein or peptide. Alternatively, such methods are 
useful in introducing and expressing in bone marrow cells an RNA which encodes 
a desirable protein or peptide, for example, a therapeutic protein or peptide. 

The present invention is carried out using a recombinant alphavirus 
to introduce a heterologous RNA into bone marrow cells. Any alphavirus that 
targets and infects bone marrow cells is suitable. Preferred alphaviruses include 
Old World alphaviruses, more preferably SF group alphaviruses and SIN group 
alphaviruses, more preferably Sindbis virus strains (e.g., TR339), S.A.AR86
virus, Girdwood S.A. virus, and Ockelbo virus. In a more preferred embodiment,
the alphavirus contains one or more attenuating mutations, as described 
hereinabove. 

Two types of recombinant virus vector are contemplated in carrying 
out the present invention. In one embodiment employing "double promoter 
vectors," the heterologous RNA is inserted into a replication and propagation 
competent virus. Double promoter vectors are described in United States Patent 
No. 5,505,947 to Johnston et al. With this type of viral vector, it is preferable 
that heterologous RNA sequences of less than 3 kilobases are inserted into the viral 
vector, more preferably those less than 2 kilobases, and more preferably still those 
less than 1 kilobase. In an alternate embodiment, propagation-defective "replicon 
vectors," as described in copending United States application 08/448,630 to
Johnston et al., will be used. One advantage of replicon viral vectors is that larger
RNA inserts, up to approximately 4-5 kilobases in length, can be utilized. Double
promoter vectors and replicon vectors are described in more detail hereinbelow. 
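
The insert-size guidance above can be summarized, purely as an illustrative aid, in a short screening routine. This is a sketch under stated assumptions: the function name `candidate_vectors` is hypothetical, the 3-kilobase limit is taken from the preference stated for double promoter vectors, and the 5-kilobase figure is an assumed hard cutoff standing in for "approximately 4-5 kilobases," which the text gives only as an approximation.

```python
from typing import List

# Limits taken from the passage above; the replicon cutoff is an assumption,
# since the text states the capacity only approximately (4-5 kilobases).
DOUBLE_PROMOTER_MAX_NT = 3000  # "less than 3 kilobases"
REPLICON_MAX_NT = 5000         # assumed upper bound of "approximately 4-5 kilobases"


def candidate_vectors(insert_length_nt: int) -> List[str]:
    """List the vector types whose stated insert capacity accommodates the insert."""
    if insert_length_nt <= 0:
        raise ValueError("insert length must be a positive number of nucleotides")
    candidates = []
    if insert_length_nt < DOUBLE_PROMOTER_MAX_NT:
        candidates.append("double promoter vector")
    if insert_length_nt <= REPLICON_MAX_NT:
        candidates.append("replicon vector")
    return candidates


if __name__ == "__main__":
    for length in (900, 2500, 4200, 6000):
        print(length, candidate_vectors(length))
```

The routine reports only whether an insert falls within the stated size ranges; the choice between vector types in practice also depends on whether a propagation-competent or propagation-defective vector is desired.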

The recombinant alphaviruses of the claimed method target the
heterologous RNA to bone marrow cells, where it expresses the encoded protein or
peptide. Heterologous RNA can be introduced and expressed in any cell type found in the
bone marrow. Bone marrow cells that may be targeted by the recombinant alphaviruses of
the present invention include, but are not limited to, polymorphonuclear cells, hemopoietic
stem cells (including megakaryocyte colony forming units (CFU-M), spleen colony
forming units (CFU-S), erythroid colony forming units (CFU-E), erythroid burst forming
units (BFU-E), and colony forming units in culture (CFU-C)), erythrocytes, macrophages
(including reticular cells), monocytes, granulocytes, megakaryocytes, lymphocytes,
fibroblasts, osteoprogenitor cells, osteoblasts, osteoclasts, marrow stromal cells,
chondrocytes and other cells of synovial joints. Preferably, marrow cells within the
endosteum are targeted, more preferably osteoblasts. Also preferred are methods in which
cells in the endosteum of synovial joints (e.g., hip and knee joints) are targeted.

By targeting to the cells of the bone marrow, it is meant that the primary
site in which the virus will be localized in vivo is the cells of the bone marrow.
Alternately stated, the alphaviruses of the present invention target bone marrow cells, such
that titers in bone marrow two days after infection are greater than 100 PFU/g crushed
bone, preferably greater than 200 PFU/g crushed bone, more preferably greater than 300
PFU/g crushed bone, and more preferably still greater than 500 PFU/g crushed bone.
Virus may be detected occasionally in other cell or tissue types, but only sporadically and
usually at low levels. Virus localization in the bone marrow can be demonstrated by any
suitable technique known in the art, such as in situ hybridization.
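
For illustration, the titer thresholds recited above can be expressed as a simple classification of a day-two bone marrow titer. The sketch below is not a disclosed assay; the function name `classify_titer` and the tier labels are hypothetical, and the comparison is strict ("greater than") to follow the wording of the passage.

```python
# Thresholds in PFU per gram of crushed bone, two days after infection,
# taken from the passage above. Tier labels are hypothetical shorthand.
TITER_TIERS = [
    (500, "more preferred still"),
    (300, "more preferred"),
    (200, "preferred"),
    (100, "targets bone marrow"),
]


def classify_titer(pfu_per_gram: float) -> str:
    """Return the highest tier whose threshold the titer strictly exceeds."""
    if pfu_per_gram < 0:
        raise ValueError("titer cannot be negative")
    for threshold, label in TITER_TIERS:
        if pfu_per_gram > threshold:
            return label
    return "below stated threshold"


if __name__ == "__main__":
    for titer in (50, 150, 250, 350, 1000):
        print(titer, "->", classify_titer(titer))
```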

Bone marrow cells are long-lived and harbor infectious alphaviruses for a 
prolonged period of time, as demonstrated in the Examples below. These characteristics 
of bone marrow cells render the present invention useful not only for the purpose of
supplying a desired protein or peptide to skeletal tissue, but also for expressing proteins or 
peptides in vivo that are needed by other cell or tissue types. 

The present invention can be carried out in vivo or with cultured bone 
marrow cells in vitro. Bone marrow cell cultures include primary cultures 
of bone marrow cells, serially-passaged cultures of bone marrow cells, and
cultures of immortalized bone marrow cell lines. Bone marrow cells may be 
cultured by any suitable means known in the art. 

The recombinant alphaviruses of the present invention carry a
heterologous RNA segment. The heterologous RNA segment encodes a promoter
and an inserted heterologous RNA. The inserted heterologous RNA may encode
any protein or a peptide which is desirably expressed by the host bone marrow
cells. Suitable heterologous RNA may be of prokaryotic (e.g., RNA encoding the
Botulinus toxin C), or eukaryotic (e.g., RNA encoding malaria Plasmodium
protein cs1) origin. Illustrative proteins and peptides encoded by the heterologous
RNAs of the present invention include hormones, growth factors, interleukins,
cytokines, chemokines, enzymes, and ribozymes. Alternately, the heterologous
RNAs encode any therapeutic protein or peptide. As a further alternative, the
heterologous RNAs of the present invention encode any immunogenic protein or
peptide.

An immunogenic protein or peptide, or "immunogen," may be any 
protein or peptide suitable for protecting the subject against a disease, including 
but not limited to microbial, bacterial, protozoal, parasitic, and viral diseases. For 
example, the immunogen may be an orthomyxovirus immunogen (e.g., an
influenza virus immunogen, such as the influenza virus hemagglutinin (HA)
surface protein or the influenza virus nucleoprotein gene, or an equine influenza
virus immunogen), or a lentivirus immunogen (e.g., an equine infectious anemia
virus immunogen, a Simian Immunodeficiency Virus (SIV) immunogen, or a
Human Immunodeficiency Virus (HIV) immunogen, such as the HIV envelope
GP160 protein and the HIV matrix/capsid proteins). The immunogen may also be
an arenavirus immunogen (e.g., Lassa fever virus immunogen, such as the Lassa
fever virus nucleocapsid protein gene and the Lassa fever envelope glycoprotein
gene), a poxvirus immunogen (e.g., vaccinia), a flavivirus immunogen (e.g., a
yellow fever virus immunogen or a Japanese encephalitis virus immunogen), a
filovirus immunogen (e.g., an Ebola virus immunogen, or a Marburg virus
immunogen), a bunyavirus immunogen (e.g., RVFV, CCHF, and SFS viruses),
or a coronavirus immunogen (e.g., an infectious human coronavirus immunogen,
such as the human coronavirus envelope glycoprotein gene, or a transmissible
gastroenteritis virus immunogen for pigs, or an infectious bronchitis virus
immunogen for chickens).

Alternatively, the present invention can be used to express 
heterologous RNAs encoding antisense oligonucleotides. In general, "antisense" 
refers to the use of small, synthetic oligonucleotides to inhibit gene expression by 
inhibiting the function of the target mRNA containing the complementary
sequence. Milligan, J.F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). Gene
expression is inhibited through hybridization to coding (sense) sequences in a
specific mRNA target by hydrogen bonding according to Watson-Crick base
pairing rules. The mechanism of antisense inhibition is that the exogenously
applied oligonucleotides decrease the mRNA and protein levels of the target gene.
Milligan, J.F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). See also Helene,
C. and Toulme, J., Biochim. Biophys. Acta 1049, 99-125 (1990); Cohen, J.S.,
Ed., OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE
EXPRESSION, CRC Press: Boca Raton, FL (1987).

Antisense oligonucleotides may be of any suitable length, depending
on the particular target being bound. The only limit on the length of the antisense
oligonucleotide is the capacity of the virus for inserted heterologous RNA.
Antisense oligonucleotides may be complementary to the entire mRNA transcript
of the target gene or only a portion thereof. Preferably the antisense
oligonucleotide is directed to an mRNA region containing a junction between
intron and exon. Where the antisense oligonucleotide is directed to an intron/exon
junction, it may either entirely overlie the junction or may be sufficiently close to
the junction to inhibit splicing out of the intervening exon during processing of
precursor mRNA to mature mRNA (e.g., with the 3' or 5' terminus of the
antisense oligonucleotide being positioned within about, for example, 10, 5, 3 or
2 nucleotides of the intron/exon junction). Also preferred are antisense
oligonucleotides which overlap the initiation codon. 
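
The two design considerations discussed above, Watson-Crick complementarity to the target mRNA and placement relative to an intron/exon junction, can be sketched as follows. This is an illustrative aid only, not a disclosed design method; the function names `antisense_rna` and `near_junction` are hypothetical, and positions are assumed to be 1-based coordinates on the precursor mRNA.

```python
# Watson-Crick complement table for RNA bases.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target_mrna: str) -> str:
    """Return the reverse complement (5'->3') of a target mRNA segment."""
    target = target_mrna.upper()
    invalid = set(target) - set("ACGU")
    if invalid:
        raise ValueError(f"unexpected bases in target: {sorted(invalid)}")
    return target.translate(_COMPLEMENT)[::-1]


def near_junction(oligo_start: int, oligo_end: int, junction: int, window: int = 10) -> bool:
    """True if the oligonucleotide's target site overlies the junction, or if
    either terminus lies within `window` nucleotides of it.

    All positions are hypothetical 1-based coordinates on the precursor mRNA.
    """
    overlies = oligo_start <= junction <= oligo_end
    distance = min(abs(oligo_start - junction), abs(oligo_end - junction))
    return overlies or distance <= window


if __name__ == "__main__":
    print(antisense_rna("AUGGCC"))
    print(near_junction(100, 120, 125, window=5))
    print(near_junction(100, 120, 140, window=10))
```

The default window of 10 nucleotides reflects the largest of the example distances given above (10, 5, 3 or 2); a smaller window may be passed for a stricter check.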

When practicing the present invention, the antisense oligonucleotides
administered may be related in origin to the species to which they are administered.
When treating humans, human antisense may be used if desired.

Promoters for use in carrying out the present invention are operable 
in bone marrow cells. An operable promoter in bone marrow cells is a promoter 
that is recognized by and functions in bone marrow cells. Promoters for use with 
the present invention must also be operatively associated with the heterologous 
RNA to be expressed in the bone marrow. A promoter is operably linked to a 
heterologous RNA if it controls the transcription of the heterologous RNA, where 
the heterologous RNA comprises a coding sequence. Suitable promoters are well 
known in the art. The Sindbis 26S promoter is preferred when the alphavirus is
a strain of Sindbis virus. Additional preferred promoters beyond the Sindbis 26S
promoter include the Girdwood S.A. 26S promoter when the alphavirus is
Girdwood S.A., the S.A.AR86 26S promoter when the alphavirus is S.A.AR86,
and any other promoter sequence recognized by alphavirus polymerases.
Alphavirus promoter sequences containing mutations which alter the activity level 
of the promoter (in relation to the activity level of the wild-type) are also suitable 
in the practice of the present invention. Such mutant promoter sequences are 
described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the disclosure of
which is incorporated in its entirety by reference. 

The heterologous RNA is introduced into the bone marrow cells by 
contacting the recombinant alphavirus carrying the heterologous RNA segment to 
the bone marrow cells. By contacting, it is meant bringing the recombinant
alphavirus and the bone marrow cells in physical proximity. The contacting step 
can be performed in vitro or in vivo. In vitro contacting can be carried out with 
cultures of immortalized or non-immortalized bone marrow cells. In one particular 
embodiment, bone marrow cells can be removed from a subject, cultured in vitro, 
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infected with the vector, and then introduced back into the subject. Contacting is 
performed, in vivo when the recombinant alphavims is administered to a subject. 
Pharmaceutical formulations of recombinant alphavims can be aclniinistered to a 
subject parenteral^ (e.g., subcutaneous, intracerebral, intradermal, intramuscular, 
intravenous and intraarticular) administration. Alternatively, pharmaceutical 
formulations of the present invention may be suitable for atlministration to the 
mucus membranes of a subject (e.g., intranasal adnunistration. by use of a 
dropper, swab, or inhaler). Methods of preparing infectious virus particles and 
pharmaceutical formulations thereof are discussed in more detail hereinbelow. 

By "introducing" the heterologous RNA segment into the bone 
marrow cells it is meant infecting the bone marrow cells with recombinant 
alphavirus containing the heterologous RNA, such that the viral vector carrying the 
heterologous RNA enters the bone marrow cells and can be expressed therein. As 
used with respect to the present invention, when the heterologous RNA is 
"expressed," it is meant that the heterologous RNA is transcribed. In particular 
embodiments of the invention in which it is desired to produce a protein or 
peptide, expression further includes the steps of post-transcriptional processing and 
translation of the mRNA transcribed from the heterologous RNA. In contrast, 
where the heterologous RNA encodes an antisense oligonucleotide, expression need 
not include post-transcriptional processing and translation. With respect to 
embodiments in which the heterologous RNA encodes an immunogenic protein or 
a protein being administered for therapeutic purposes, expression may also include 
the further step of post-translational processing to produce an immunogenic or 
therapeutically-active protein. 



The present invention also provides infectious RNAs, as described 
hereinabove, and cDNAs encoding the same. Preferably the infectious RNAs and 
cDNAs are derived from the S.A.AR86, Girdwood S.A., TR339, or Ockelbo 
viruses. The cDNA clones can be generated by any of a variety of suitable 
methods known to those skilled in the art. A preferred method is the method set 
forth in United States Patent No. 5,185,440 to Davis et al., the disclosure of which 
is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983). 

RNA is preferably synthesized from the DNA sequence in vitro 
using purified RNA polymerase in the presence of ribonucleotide triphosphates and 
cap analogs in accordance with conventional techniques. However, the RNA may 
also be synthesized intracellularly after introduction of the cDNA. 



A. Double Promoter Vectors. 

In one embodiment of the invention, double promoter vectors are 
used to introduce the heterologous RNA into the target bone marrow cells. A 
double promoter virus vector is a replication and propagation competent virus. 
Double promoter vectors are described in United States Patent No. 5,505,947 to 
Johnston et al., the disclosure of which is incorporated in its entirety by reference. 
Preferred alphaviruses for constructing the double promoter vectors are S.A.AR86, 
Girdwood S.A., TR339 and Ockelbo viruses. More preferably, the double 
promoter vector contains one or more attenuating mutations. Attenuating 
mutations are described in more detail hereinabove. 

The double promoter vector is constructed so as to contain a second 
subgenomic promoter (i.e., 26S promoter) inserted 3' to the virus RNA encoding 
the structural proteins. The heterologous RNA is inserted between the second 
subgenomic promoter, so as to be operatively associated therewith, and the 3' 
UTR of the virus genome. Heterologous RNA sequences of less than 3 kilobases, 
more preferably those less than 2 kilobases, and more preferably still those less 
than 1 kilobase, can be inserted into the double promoter vector. In a preferred 
embodiment of the invention, the double promoter vector is derived from 
Girdwood S.A., and the second subgenomic promoter is a duplicate of the 
Girdwood S.A. subgenomic promoter. In an alternate preferred embodiment, the 
double promoter vector is derived from TR339, and the second subgenomic 
promoter is a duplicate of the TR339 subgenomic promoter. 

B. Replicon Vectors. 

Replicon vectors, which are propagation-defective virus vectors, can 
also be used to carry out the present invention. Replicon vectors are described in 
more detail in copending United States Application 08/448,630 to Johnston et al., 
the disclosure of which is incorporated in its entirety by reference. Preferred 
alphaviruses for constructing the replicon vectors are S.A.AR86, Girdwood S.A., 
TR339, and Ockelbo. 



In general, in the replicon system, a foreign gene to be expressed 
is inserted in place of at least one of the viral structural protein genes in a 
transcription plasmid containing an otherwise full-length cDNA copy of the 
alphavirus genome RNA. RNA transcribed from this plasmid contains an intact 
copy of the viral nonstructural genes which are responsible for RNA replication 
and transcription. Thus, if the transcribed RNA is transfected into susceptible 
cells, it will be replicated and translated to give the nonstructural proteins. These 
proteins will transcribe the transfected RNA to give high levels of subgenomic 
mRNA, which will then be translated to produce high levels of the foreign protein. 
The autonomously replicating RNA (i.e., replicon) can only be packaged into virus 
particles if the alphavirus structural protein genes are provided on one or more 
"helper" RNAs, which are cotransfected into cells along with the replicon RNA. 
The helper RNAs do not contain the viral nonstructural genes for replication, but 
these functions are provided in trans by the replicon RNA. Similarly, the 
transcriptase functions translated from the replicon RNA transcribe the structural 
protein genes on the helper RNA, resulting in the synthesis of viral structural 
proteins and packaging of the replicon into virus-like particles. As the packaging 
or encapsidation signal for alphavirus RNAs is located within the nonstructural 
genes, the absence of these sequences in the helper RNAs precludes their 
incorporation into virus particles. 
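
By way of illustration only, the packaging logic described above can be sketched in Python. The names RnaMolecule and packaged_rnas are hypothetical and do not appear in the specification. The model assumes, as a simplification, that any RNA carrying the nonstructural genes also carries the packaging signal, and that the capsid, E1, and E2 proteins must all be supplied before particles can form.

```python
from dataclasses import dataclass

# Structural proteins needed for encapsidation (simplified model).
STRUCTURAL = frozenset({"capsid", "E1", "E2"})


@dataclass(frozen=True)
class RnaMolecule:
    """Hypothetical description of one cotransfected RNA."""
    name: str
    has_nonstructural_genes: bool  # assumed to carry the packaging signal
    structural_genes: frozenset = frozenset()


def packaged_rnas(cotransfected):
    """Return the names of RNAs expected to be packaged into particles."""
    # Replicase functions must be present on at least one RNA (provided in trans).
    replicase_present = any(r.has_nonstructural_genes for r in cotransfected)
    # Pool the structural protein genes supplied by every RNA in the cell.
    supplied = frozenset().union(*(r.structural_genes for r in cotransfected))
    if not replicase_present or not supplied >= STRUCTURAL:
        return []
    # Only RNAs carrying the packaging signal are incorporated.
    return [r.name for r in cotransfected if r.has_nonstructural_genes]


replicon = RnaMolecule("replicon", has_nonstructural_genes=True)
helper = RnaMolecule("helper", False, frozenset({"capsid", "E1", "E2"}))
print(packaged_rnas([replicon, helper]))
print(packaged_rnas([replicon]))
```

In this sketch the helper RNA is never returned, reflecting the statement that helper RNAs lacking the nonstructural sequences are not incorporated into virus particles.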



Alphavirus-permissive cells employed in the methods of the present 
invention are cells which, upon transfection with the viral RNA transcript, are 
capable of producing viral particles. Preferred alphavirus-permissive cells are 
TR339-permissive cells, Girdwood S.A.-permissive cells, S.A.AR86-permissive cells, and 
Ockelbo-permissive cells. Alphaviruses have a broad host range. Examples of suitable 
host cells include, but are not limited to, Vero cells, baby hamster kidney (BHK) cells, and 
chicken embryo fibroblast cells. 

The phrase "structural protein" as used herein refers to the encoded 
proteins which are required for encapsidation (e.g., packaging) of the RNA replicon, and 
include the capsid protein, E1 glycoprotein, and E2 glycoprotein. As described 
hereinabove, the structural proteins of the alphavirus are distributed among one or more 
helper RNAs (i.e., a first helper RNA and a second helper RNA). In addition, one or 
more structural proteins may be located on the same RNA molecule as the replicon RNA, 
provided that at least one structural protein is deleted from the replicon RNA such that the 
resulting alphavirus particle is propagation defective. As used herein, the terms "deleted" 
or "deletion" mean either total deletion of the specified segment or the deletion of a 
sufficient portion of the specified segment to render the segment inoperative or 
nonfunctional, in accordance with standard usage. See, e.g., U.S. Patent No. 4,650,764 
to Temin et al. The term "propagation defective" as used herein means that the replicon 
RNA cannot be encapsidated in the host cell in the absence of the helper RNA. The 
resulting alphavirus replicon particles are propagation defective inasmuch as the replicon 
RNA in these particles does not encode all of the alphavirus structural proteins required 
for encapsidation, at least one of the required structural proteins being deleted therefrom, 
such that the replicon RNA initiates only an abortive infection; no new viral particles are 
produced, and there is no spread of the infection to other cells. 

The helper cell for expressing the infectious, propagation defective alphavirus 
particle comprises a set of RNAs, as described above. The set of RNAs principally 
includes a first helper RNA and a second helper RNA. The first helper RNA includes 
RNA encoding at least one alphavirus structural protein but does not encode all alphavirus 
structural proteins. In other words, the first helper RNA does not encode at least one 
alphavirus structural protein; the at least one non-coded alphavirus structural protein being 
deleted from the first helper RNA. 
In one embodiment, the first helper RNA includes RNA encoding the alphavirus 
E1 glycoprotein, with the alphavirus capsid protein and the alphavirus E2 
glycoprotein being deleted from the first helper RNA. In another embodiment, the 
first helper RNA includes RNA encoding the alphavirus E2 glycoprotein, with the 
alphavirus capsid protein and the alphavirus E1 glycoprotein being deleted from 
the first helper RNA. In a third, preferred embodiment, the first helper RNA 
includes RNA encoding the alphavirus E1 glycoprotein and the alphavirus E2 
glycoprotein, with the alphavirus capsid protein being deleted from the first helper 
RNA. 



The second helper RNA includes RNA encoding at least one 
alphavirus structural protein which is different from the at least one structural 
protein encoded by the first helper RNA. Thus, the second helper RNA encodes 
at least one alphavirus structural protein which is not encoded by the first helper 
RNA. The second helper RNA does not encode the at least one alphavirus 
structural protein which is encoded by the first helper RNA, thus the first and 
second helper RNAs do not encode duplicate structural proteins. In the 
embodiment wherein the first helper RNA includes RNA encoding only the 
alphavirus E1 glycoprotein, the second helper RNA may include RNA encoding 
one or both of the alphavirus capsid protein and the alphavirus E2 glycoprotein 
which are deleted from the first helper RNA. In the embodiment wherein the first 
helper RNA includes RNA encoding only the alphavirus E2 glycoprotein, the 
second helper RNA may include RNA encoding one or both of the alphavirus 
capsid protein and the alphavirus E1 glycoprotein which are deleted from the first 
helper RNA. In the embodiment wherein the first helper RNA includes RNA 
encoding both the alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, 
the second helper RNA may include RNA encoding the alphavirus capsid protein 
which is deleted from the first helper RNA. 
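
The division of structural proteins between the two helper RNAs can be expressed as simple set rules. The following Python sketch is illustrative only; the function names check_helper_pair and helpers_supply_all are hypothetical, and the check that all three proteins are supplied is an added convenience, not a requirement stated for every embodiment.

```python
# The three structural proteins named in the text.
REQUIRED = frozenset({"capsid", "E1", "E2"})


def check_helper_pair(first, second):
    """Return a list of rule violations for a two-helper design (empty if none)."""
    first, second = frozenset(first), frozenset(second)
    problems = []
    unknown = (first | second) - REQUIRED
    if unknown:
        problems.append(f"unknown structural proteins: {sorted(unknown)}")
    # The first helper encodes at least one, but not all, structural proteins.
    if not first or first >= REQUIRED:
        problems.append("first helper must encode at least one, but not all, structural proteins")
    # The second helper encodes at least one protein absent from the first.
    if not second - first:
        problems.append("second helper must encode a protein not encoded by the first")
    # The two helpers do not encode duplicate structural proteins.
    if first & second:
        problems.append(f"duplicated on both helpers: {sorted(first & second)}")
    return problems


def helpers_supply_all(first, second):
    """True if the two helpers together encode capsid, E1, and E2."""
    return frozenset(first) | frozenset(second) >= REQUIRED


print(check_helper_pair({"E1", "E2"}, {"capsid"}))
print(check_helper_pair({"E1"}, {"E1", "capsid"}))
```

The first call corresponds to the third, preferred embodiment (glycoproteins on the first helper, capsid on the second) and reports no violations; the second call is flagged because E1 appears on both helpers.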

In one embodiment, the packaging segment (RNA comprising the 
encapsidation or packaging signal) is deleted from at least the first helper RNA. 



WO 98/36779 PCT/US98/02945 

In a preferred embodiment, the packaging segment is deleted from both the first 
helper RNA and the second helper RNA. 



In the preferred embodiment wherein the packaging segment is 
deleted from both the first helper RNA and the second helper RNA, the helper cell 
is co-transfected with a replicon RNA in addition to the first helper RNA and the 
second helper RNA. The replicon RNA encodes the packaging segment and an 
inserted heterologous RNA. The inserted heterologous RNA may be RNA 
encoding a protein or a peptide. In a preferred embodiment, the replicon RNA, 
the first helper RNA and the second helper RNA are provided on separate 
molecules such that a first molecule, i.e., the replicon RNA, includes RNA 
encoding the packaging segment and the inserted heterologous RNA, a second 
molecule, i.e., the first helper RNA, includes RNA encoding at least one but not 
all of the required alphavirus structural proteins, and a third molecule, i.e., the 
second helper RNA, includes RNA encoding at least one but not all of the required 
alphavirus structural proteins. For example, in one preferred embodiment of the 
present invention, the helper cell includes a set of RNAs which include (a) a 
replicon RNA including RNA encoding an alphavirus packaging sequence and an 
inserted heterologous RNA, (b) a first helper RNA including RNA encoding the 
alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, and (c) a second 
helper RNA including RNA encoding the alphavirus capsid protein so that the 
alphavirus E1 glycoprotein, the alphavirus E2 glycoprotein and the capsid protein 
assemble together into alphavirus particles in the host cell. 

In an alternate embodiment, the replicon RNA and the first helper 
RNA are on separate molecules, and the replicon RNA and RNA encoding a 
structural gene not encoded by the first helper RNA are on another single molecule 
together, such that a first molecule, i.e., the first helper RNA, includes RNA 
encoding at least one but not all of the required alphavirus structural proteins, and 
a second molecule, i.e., the replicon RNA, includes RNA encoding the packaging 
segment, the inserted heterologous RNA, and the remaining structural proteins not 
encoded by the first helper RNA. For example, in one preferred embodiment of 
the present invention, the helper cell includes a set of RNAs including (a) a 
replicon RNA including RNA encoding an alphavirus packaging sequence, an 
inserted heterologous RNA, and an alphavirus capsid protein, and (b) a first helper 
RNA including RNA encoding the alphavirus E1 glycoprotein and the alphavirus 
E2 glycoprotein so that the alphavirus E1 glycoprotein, the alphavirus E2 
glycoprotein and the capsid protein assemble together into alphavirus particles in 
the host cell, with the replicon RNA packaged therein. 






In one preferred embodiment of the present invention, the RNA 
encoding the alphavirus structural proteins, i.e., the capsid, E1 glycoprotein and 
E2 glycoprotein, contains at least one attenuating mutation, as described 
hereinabove. Thus, according to this embodiment, at least one of the first helper 
RNA and the second helper RNA includes at least one attenuating mutation. In 
a more preferred embodiment, at least one of the first helper RNA and the second 
helper RNA includes at least two, or multiple, attenuating mutations. The multiple 
attenuating mutations may be positioned in either the first helper RNA or in the 
second helper RNA, or they may be distributed randomly with one or more 
attenuating mutations being positioned in the first helper RNA and one or more 
attenuating mutations positioned in the second helper RNA. Alternatively, when 
the replicon RNA and the RNA encoding the structural proteins not encoded by 
the first helper RNA are located on the same molecule, an attenuating mutation 
may be positioned in the RNA which codes for the structural protein not encoded 
by the first helper RNA. The attenuating mutations may also be located within the 
RNA encoding non-structural proteins (e.g., the replicon RNA). 

Preferably, the first helper RNA and the second helper RNA also 
include a promoter. It is also preferred that the replicon RNA also includes a 
promoter. Suitable promoters for inclusion in the first helper RNA, second helper 
RNA and replicon RNA are well known in the art. One preferred promoter is the 
Girdwood S.A. 26S promoter for use when the alphavirus is Girdwood S.A. 
Another preferred promoter is the TR339 26S promoter for use when the 
alphavirus is TR339. Additional promoters beyond the Girdwood S.A. and TR339 
promoters include the VEE 26S promoter, the Sindbis 26S promoter, the Semliki 
Forest 26S promoter, and any other promoter sequence recognized by alphavirus 
polymerases. Alphavirus promoter sequences containing mutations which alter the 
activity level of the promoter (in relation to the activity level of the wild-type) are 
also suitable in the practice of the present invention. Such mutant promoter 
sequences are described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the 
disclosure of which is incorporated herein in its entirety. In the system wherein 
the first helper RNA, the second helper RNA, and the replicon RNA are all on 
separate molecules, the promoters, if the same promoter is used for all three 
RNAs, provide a homologous sequence between the three molecules. It is 
preferred that the selected promoter is operative with the non-structural proteins 
encoded by the replicon RNA molecule. 






In cases where vaccination with two immunogens provides improved 
protection against disease as compared to vaccination with only a single 
immunogen, a double-promoter replicon would ensure that both immunogens are 
produced in the same cell. Such a replicon would be the same as the one 
described above, except that it would contain two copies of the 26S RNA 
promoter, each followed by a different multiple cloning site, to allow for the 
insertion and expression of two different heterologous proteins. Another useful 
strategy is to insert the IRES sequence from the picornavirus, EMC virus, between 
the two heterologous genes downstream from the single 26S promoter of the 
replicon described above, thus leading to expression of two immunogens from the 
single replicon transcript in the same cell. 






C. Uses of the Present Invention. 

The alphavirus vectors, RNAs, cDNAs, helper cells, infectious virus 
particles, and methods of the present invention find use in in vitro expression 
systems, wherein the inserted heterologous RNA encodes a protein or peptide 
which is desirably produced in vitro. The RNAs, cDNAs, helper cells, infectious 
virus particles, methods, and pharmaceutical formulations of the present invention 
are additionally useful in a method of administering a protein or peptide to a 
subject in need of the protein or peptide, as a method of treatment or otherwise. 
In this embodiment of the invention, the heterologous RNA encodes the desired 
protein or peptide, and pharmaceutical formulations of the present invention are 
administered to a subject in need of the desired protein or peptide. In this manner, 
the protein or peptide may thus be produced in vivo in the subject. The subject 
may be in need of the protein or peptide because the subject has a deficiency 
thereof, or because the production of the protein or peptide in the subject may 
impart some therapeutic effect, as a method of treatment or otherwise. 

Alternately, the claimed methods provide a vaccination strategy, 
wherein the heterologous RNA encodes an immunogenic protein or peptide. 

The methods and products of the invention are also useful as 
antigens and for evoking the production of antibodies in animals such as horses 
and rabbits, from which the antibodies may be collected and then used in 
diagnostic assays in accordance with known techniques. 

A further aspect of the present invention is a method of introducing 
and expressing antisense oligonucleotides in bone marrow cell cultures to regulate 
gene expression. Alternately, the claimed method finds use in introducing and 
expressing a protein or peptide in bone marrow cell cultures. 

II. Girdwood S.A. and TR339 Clones. 

Disclosed hereinbelow are genomic RNA sequences encoding live 
Girdwood S.A. virus, live S.A.AR86 virus, and live Sindbis strain TR339 virus, 
cDNAs derived therefrom, infectious RNA transcripts encoded by the cDNAs, 
infectious viral particles containing the infectious RNA transcripts, and 
pharmaceutical formulations derived therefrom. 

The cDNA sequence of Girdwood S.A. is given herein as SEQ ID 
NO:4. Alternatively, the cDNA may have a sequence which differs from the 
cDNA of SEQ ID NO:4, but which has the same protein sequence as the cDNA 
given herein as SEQ ID NO:4. Thus, the cDNA may include one or more silent 
mutations. 



The phrase "silent mutation" as used herein refers to mutations in 
the cDNA coding sequence which do not produce mutations in the corresponding 
protein sequence translated therefrom. 
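
The definition of a silent mutation lends itself to a direct check: two coding sequences differ only by silent mutations if they are not identical yet translate to the same protein. The Python sketch below is illustrative only. It assumes the standard genetic code, a single in-frame coding sequence written as DNA, and hypothetical function names; the short sequences are invented for demonstration and are not taken from SEQ ID NO:4 or SEQ ID NO:8.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def differs_only_by_silent_mutations(reference, variant):
    """True if the sequences differ at the DNA level but encode the same protein."""
    return (reference.upper() != variant.upper()
            and translate(reference) == translate(variant))


print(translate("ATGGCTTAA"))
print(differs_only_by_silent_mutations("ATGGCTTAA", "ATGGCCTAA"))
print(differs_only_by_silent_mutations("ATGGCTTAA", "ATGGATTAA"))
```

Here GCT and GCC both encode alanine, so the first variant is silent, whereas GAT encodes aspartate and changes the protein.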

Likewise, the cDNA sequence of TR339 is given herein as SEQ ID 
NO:8. Alternatively, the cDNA may have a sequence which differs from the 
cDNA of SEQ ID NO:8, but which has the same protein sequence as the cDNA 
given herein as SEQ ID NO:8. Thus, the cDNA may include one or more silent 
mutations. 

The cDNAs encoding infectious Girdwood S.A. and TR339 virus 
RNA transcripts of the present invention include those homologous to, and having 
essentially the same biological properties as, the cDNA sequences disclosed herein 
as SEQ ID NO:4 and SEQ ID NO:8, respectively. Thus, cDNAs that hybridize 
to cDNAs encoding infectious Girdwood S.A. or TR339 virus RNA transcripts 
disclosed herein are also an aspect of this invention. Conditions which will permit 
other cDNAs encoding infectious Girdwood S.A. or TR339 virus transcripts to 
hybridize to the cDNAs disclosed herein can be determined in accordance with 
known techniques. For example, hybridization of such sequences may be carried 
out under conditions of reduced stringency, medium stringency, or even high 
stringency (e.g., conditions represented by a wash stringency of 35-40% 
formamide with 5X Denhardt's solution, 0.5% SDS and 1X SSPE at 37°C; 
conditions represented by a wash stringency of 40-45% formamide with 5X 
Denhardt's solution, 0.5% SDS, and 1X SSPE at 42°C; and conditions represented 
by a wash stringency of 50% formamide with 5X Denhardt's solution, 0.5% SDS 
and 1X SSPE at 42°C, respectively) to cDNA encoding infectious Girdwood S.A. 
or TR339 virus RNA transcripts disclosed herein in a standard hybridization assay. 
See J. SAMBROOK ET AL., MOLECULAR CLONING: A LABORATORY 
MANUAL (2d ed. 1989). In general, cDNA sequences encoding infectious 
Girdwood S.A. or TR339 virus RNA transcripts that hybridize to the cDNAs 
disclosed herein will be at least 30% homologous, 50% homologous, 75% 
homologous, and even 95% homologous or more with the cDNA sequences 
encoding infectious Girdwood S.A. or TR339 virus RNA transcripts disclosed 
herein. 
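
As a rough illustration of the homology thresholds recited above, the following Python sketch computes percent identity between two sequences and reports the highest recited threshold that is met. It rests on assumptions not found in the specification: "percent homologous" is read here as percent identity, the sequences are taken to be already aligned to equal length (with "-" for gaps), and the function names are hypothetical.

```python
# Thresholds recited in the text, highest first.
THRESHOLDS = (95, 75, 50, 30)


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical, non-gap bases."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(pct):
    """Return the highest recited threshold that pct reaches, or None."""
    for threshold in THRESHOLDS:
        if pct >= threshold:
            return threshold
    return None


pct = percent_identity("ACGTACGT", "ACGTACGA")
print(pct, highest_threshold_met(pct))
```

A production comparison would first align the sequences with an established alignment tool; this sketch only covers the arithmetic that follows alignment.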






Promoter sequences and Girdwood S.A. virus or Sindbis virus strain 
TR339 cDNA clones are operatively associated in the present invention such that 
the promoter causes the cDNA clone to be transcribed in the presence of an RNA 
polymerase which binds to the promoter. The promoter is positioned on the 5' end 
(with respect to the virion RNA sequence) of the cDNA clone. An excessive 
number of nucleotides between the promoter sequence and the cDNA clone will 
result in the inoperability of the construct. Hence, the number of nucleotides 
between the promoter sequence and the cDNA clone is preferably not more than 
eight, more preferably not more than five, still more preferably not more than 
three, and most preferably not more than one. 

Examples of promoters which are useful in the cDNA sequences of 
the present invention include, but are not limited to, T3 promoters, T7 promoters, 
cytomegalovirus (CMV) promoters, and SP6 promoters. The DNA sequence of 
the present invention may reside in any suitable transcription vector. The DNA 
sequence preferably has a complementary DNA sequence bound thereto so that the 
double-stranded sequence will serve as an active template for RNA polymerase. 
The transcription vector preferably comprises a plasmid. When the DNA sequence 
comprises a plasmid, it is preferred that a unique restriction site be provided 3' 
(with respect to the virion RNA sequence) to the cDNA clone. This provides a 
means for linearizing the DNA sequence to allow the transcription of 
genome-length RNA in vitro. 
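
Whether a candidate restriction site is unique in a plasmid, and lies outside the promoter and cDNA insert, can be checked computationally before linearization. The Python sketch below is illustrative only: the function names, the coordinates insert_start and insert_end, and the example sequences are hypothetical, the 8-base palindrome GCGGCCGC serves merely as a placeholder recognition sequence, and the overlap test ignores inserts that wrap around the plasmid origin.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def site_positions(plasmid, site):
    """0-based start positions of the site on either strand of a circular plasmid."""
    plasmid, site = plasmid.upper(), site.upper()
    # Extend the sequence so that sites spanning the origin are found.
    circular = plasmid + plasmid[:len(site) - 1]
    # A set avoids double-counting palindromic sites.
    targets = {site, reverse_complement(site)}
    return [i for i in range(len(plasmid))
            if circular[i:i + len(site)] in targets]


def is_unique_linearization_site(plasmid, site, insert_start, insert_end):
    """True if the site occurs exactly once and does not overlap the insert."""
    hits = site_positions(plasmid, site)
    if len(hits) != 1:
        return False
    hit = hits[0]
    overlaps_insert = hit < insert_end and hit + len(site) > insert_start
    return not overlaps_insert


plasmid = "AAAAGCGGCCGCTTTT"  # invented example; insert assumed at positions 0-4
print(site_positions(plasmid, "GCGGCCGC"))
print(is_unique_linearization_site(plasmid, "GCGGCCGC", 0, 4))
```

A site that also occurs within the cDNA would cut the template internally and prevent transcription of genome-length RNA, which is why the sketch rejects any site that is not unique.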



The cDNA clones can be generated by any of a variety of suitable 
methods known to those skilled in the art. A preferred method is the method set 
forth in United States Patent No. 5,185,440 to Davis et al., the disclosure of which 
is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983). 

RNA is preferably synthesized from the DNA sequence in vitro 
using purified RNA polymerase in the presence of ribonucleotide triphosphates and 
cap analogs in accordance with conventional techniques. However, the RNA may 
also be synthesized intracellularly after introduction of the cDNA. 

The Girdwood S.A. and TR339 cDNA clones and the infectious 
RNAs and infectious virus particles produced therefrom of the present invention 
are useful for the preparation of pharmaceutical formulations, such as vaccines. 
In addition, the cDNA clones, infectious RNAs, and infectious viral particles of 
the present invention are useful for administration to animals for the purpose of 
producing antibodies to the Girdwood S.A. virus or the Sindbis virus strain 
TR339, which antibodies may be collected and used in known diagnostic 
techniques for the detection of Girdwood S.A. virus or Sindbis virus strain TR339. 
Antibodies can also be generated to the viral proteins expressed from the cDNAs 
disclosed herein. As another aspect of the present invention, the claimed cDNA 
clones are useful as nucleotide probes to detect the presence of Girdwood S.A. or 
TR339 genomic RNA or transcripts. 

III. Infectious Virus Particles and Pharmaceutical Formulations. 

The infectious virus particles of the present invention include those 
containing double promoter vectors and those containing replicon vectors as 
described hereinabove. Alternately, the infectious virus particles contain infectious 
RNAs encoding the Girdwood S.A. or TR339 genome. When the infectious RNA 
comprises the Girdwood S.A. genome, preferably the RNA has the sequence 
encoded by the cDNA given as SEQ ID NO:4. When the infectious RNA 
comprises the TR339 genome, preferably the RNA has the sequence encoded by 
the cDNA given as SEQ ID NO:8. 

The infectious alphavirus particles of the present invention may be 
prepared according to the methods disclosed herein in combination with techniques 
known to those skilled in the art. These methods include transfecting an 
alphavirus-permissive cell with a replicon RNA including the alphavirus packaging 
segment and an inserted heterologous RNA, a first helper RNA including RNA 
encoding at least one alphavirus structural protein, and a second helper RNA 
including RNA encoding at least one alphavirus structural protein which is 
different from that encoded by the first helper RNA. Alternately, and preferably, 
at least one of the helper RNAs is produced from a cDNA encoding the helper 
RNA and operably associated with an appropriate promoter, the cDNA being 
stably transfected and integrated into the cells. More preferably, all of the helper 
RNAs will be "launched" from stably transfected cDNAs. The step of transfecting 
the alphavirus-permissive cell can be carried out according to any suitable means 
known to those skilled in the art, as described above with respect to propagation- 
competent viruses. 

Uptake of propagation-competent RNA into the cells in vitro can be 
carried out according to any suitable means known to those skilled in the art. 
Uptake of RNA into the cells can be achieved, for example, by treating the cells 
with DEAE-dextran, treating the RNA with LIPOFECTIN® before addition to the 
cells, or by electroporation, with electroporation being the currently preferred 
means. These techniques are well known in the art. See e.g., United States 
Patent No. 5,185,440 to Davis et al., and PCT Publication No. WO 92/10578 to 
Bioption AB, the disclosures of which are incorporated herein by reference in their 
entirety. Uptake of propagation-competent RNA into the cell in vivo can be 
carried out by administering the infectious RNA to a subject as described in 
Section I above. 

The infectious RNAs may also contain a heterologous RNA 
segment, where the heterologous RNA segment contains a heterologous RNA and 
a promoter operably associated therewith. It is preferred that the infectious RNA 
introduces and expresses the heterologous RNA in bone marrow cells as described 
in Section I above. According to this embodiment, it is preferable that the 
promoter operatively associated with the heterologous RNA is operable in bone 
marrow cells. The heterologous RNA may encode any protein or peptide, 
preferably an immunogenic protein or peptide, a therapeutic protein or peptide, a 
hormone, a growth factor, an interleukin, a cytokine, a chemokine, an enzyme, 
a ribozyme, or an antisense oligonucleotide as described in more detail in Section 
I above. 

The step of facilitating the production of the infectious viral particles 
in the cells may be carried out using conventional techniques. See e.g., United 
States Patent No. 5,185,440 to Davis et al., PCT Publication No. WO 92/10578 
to Bioption AB, and United States Patent No. 4,650,764 to Temin et al. (although 
Temin et al. relates to retroviruses rather than alphaviruses). The infectious viral 
particles may be produced by standard cell culture growth techniques. 

The step of collecting the infectious virus particles may also be 
carried out using conventional techniques. For example, the infectious particles 
may be collected by cell lysis, or collection of the supernatant of the cell culture, 
as is known in the art. See e.g., United States Patent No. 5,185,440 to Davis et 
al., PCT Publication No. WO 92/10578 to Bioption AB, and United States Patent 
No. 4,650,764 to Temin et al. Other suitable techniques will be known to those 
skilled in the art. Optionally, the collected infectious virus particles may be 
purified if desired. Suitable purification techniques are well known to those skilled 
in the art. 

Pharmaceutical formulations, such as vaccines, of the present 
invention comprise an immunogenic amount of the infectious virus particles in 
combination with a pharmaceutically acceptable carrier. An "immunogenic 
amount" is an amount of the infectious virus particles which is sufficient to evoke 
an immune response in the subject to which the pharmaceutical formulation is 
administered. An amount of from about 10^1 to about 10^7 particles, and preferably 
about 10^4 to 10* particles per dose is believed suitable, depending upon the age and 
species of the subject being treated, and the immunogen against which the immune 
response is desired. 

Pharmaceutical formulations of the present invention for therapeutic 
use comprise a therapeutic amount of the infectious virus particles in combination 
with a pharmaceutically acceptable carrier. A "therapeutic amount" is an amount 
of the infectious virus particles which is sufficient to produce a therapeutic effect 
(e.g., triggering an immune response or supplying a protein to a subject in need 
thereof) in the subject to which the pharmaceutical formulation is administered. 
The therapeutic amount will depend upon the age and species of the subject being 
treated, and the therapeutic protein or peptide being administered. Typical dosages 
are an amount from about 10^1 to about 10 s infectious units. 

Exemplary pharmaceutically acceptable carriers include, but are not 
limited to, sterile pyrogen-free water and sterile pyrogen-free physiological saline 
solution. Subjects which may be administered immunogenic amounts of the 
infectious virus particles of the present invention include but are not limited to 
human and animal (e.g., pig, cattle, dog, horse, donkey, mouse, hamster, 
monkeys) subjects. 

Pharmaceutical formulations of the present invention include those 
suitable for parenteral (e.g., subcutaneous, intracerebral, intradermal, 
intramuscular, intravenous and intraarticular) administration. Alternatively, 
pharmaceutical formulations of the present invention may be suitable for 
administration to the mucous membranes of a subject (e.g., intranasal administration 
by use of a dropper, swab, or inhaler). The formulations may be conveniently 
prepared in unit dosage form and may be prepared by any of the methods well 
known in the art. 






The following examples are provided to illustrate the present 
invention, and should not be construed as limiting thereof. In these examples, 
PBS means phosphate buffered saline, EDTA means ethylene diamine tetraacetate, 
ml means milliliter, µl means microliter, mM means millimolar, µM means 
micromolar, u means unit, PFU means plaque forming units, g means gram, mg 
means milligram, µg means microgram, cpm means counts per minute, ic means 
intracerebral or intracerebrally, ip means intraperitoneal or intraperitoneally, iv 
means intravenous or intravenously, and sc means subcutaneous or subcutaneously. 

Amino acid sequences disclosed herein are presented in the amino 
to carboxyl direction, from left to right. The amino and carboxyl groups are not 
presented in the sequence. Nucleotide sequences are presented herein by single 
strand only in the 5' to 3' direction, from left to right. Nucleotides and amino 
acids are represented herein in the manner recommended by the IUPAC-IUB 
Biochemical Nomenclature Commission, or (for amino acids) by either one letter 
or three letter code, in accordance with 37 CFR § 1.822 and established usage. 
Where one letter amino acid code is used, the same sequence is also presented 
elsewhere in three letter code. 



EXAMPLE I 
Cells and Virus Stocks 
S.A.AR86 was isolated in 1954 from a pool of Culex sp. mosquitoes 
collected near Johannesburg, South Africa. Weinbren et al., S. Afr. Med. J. 30, 
631-36 (1956). Ockelbo82 was isolated from Culiseta sp. mosquitoes collected in 
Edsbyn, Sweden in 1982 and was associated serologically with human disease. 
Niklasson et al., Am. J. Trop. Med. Hyg. 33, 1212-17 (1984). Girdwood S.A. 
was isolated from a human patient in the Johannesburg area of South Africa in 
1963. Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963). Molecularly cloned 
virus TR339 represents the deduced consensus sequence of Sindbis AR339. 
McKnight et al., J. Virol. 70, 1981-89 (1996); William Klimstra, personal 
communication. TRSB is a laboratory strain of Sindbis isolate AR339 derived 
from a cDNA clone pTRSB and differing from the AR339 consensus sequence at 
three codons. McKnight et al., J. Virol. 70, 1981-89 (1996). pTR5000 is a 
full-length cDNA clone of Sindbis AR339 following the SP6 phage promoter and 
containing mostly Sindbis AR339 sequences. 

Stocks of all molecularly cloned viruses were prepared by 
electroporating genome length in vitro transcripts of their respective cDNA clones 
into BHK-21 cells. Heidner et al., J. Virol. 68, 2683-92 (1994). Girdwood S.A. 
(Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963)) and Ockelbo82 (Espmark and 
Niklasson, Am. J. Trop. Med. Hyg. 33, 1203-11 (1984); Niklasson et al., Am. J. 
Trop. Med. Hyg. 33, 1212-17 (1984)) were passed one to three times in BHK-21 
cells in order to produce amplified stocks of virus. All virus stocks were 
stored at -70°C until needed. The titers of the virus stocks were determined on 
BHK-21 cells from aliquots of frozen virus. 



EXAMPLE 2 

Cloning the S.A.AR86 and Girdwood S.A. Genomic Sequences 
The sequences of S.A.AR86 (Figure 1, SEQ ID NO:1) and 
Girdwood S.A. (Figure 3, SEQ ID NO:4) were determined from uncloned reverse 
transcriptase-polymerase chain reaction (RT-PCR) fragments amplified from virion 
RNA. Heidner et al., J. Virol. 68, 2683-92 (1994). The sequence of the 5' 40 
nucleotides was determined by directly sequencing the genomic RNA. Sanger et 
al., Proc. Natl. Acad. Sci. USA 74, 5463-67 (1977); Zimmern and Kaesberg, 
Proc. Natl. Acad. Sci. USA 75, 4257-61 (1978); Ahlquist et al., Cell 23, 183-89 
(1981). 



The S.A.AR86 genome was 11,663 nucleotides in length, excluding 
the 5' CAP and 3' poly(A) tail, 40 nucleotides shorter than the alphavirus prototype 
Sindbis strain AR339. Strauss et al., Virology 133, 92-110 (1984). Compared 
with the consensus sequence of Sindbis virus AR339 (McKnight et al., J. Virol. 
70, 1981-89 (1996)), S.A.AR86 contained two separate 6-nucleotide insertions, and 
one 3-nucleotide insertion in the 3' half of the nsP3 gene, a region not well 
conserved among alphaviruses. The two 6-nucleotide insertions were found 
immediately 3' of nucleotides 5403 and 5450, and the 3-nucleotide insertion was 
immediately 3' of nucleotide 5546 compared with the AR339 genome. In addition, 
S.A.AR86 contained a 54-nucleotide deletion in nsP3 which spanned nucleotides 
5256 to 5311 of AR339. As a result of these deletions and insertions, S.A.AR86 
nsP3 was 13 amino acids smaller than AR339, containing an 18-amino acid 
deletion and a total of 5 amino acids inserted. The 3' untranslated region of 
S.A.AR86 contained, with respect to AR339, two 1-nucleotide deletions at 
nucleotides 11,513 and 11,602, and one 1-nucleotide insertion following nucleotide 
11,664. The total numbers of nucleotides and predicted amino acids comprising 
the remaining genes of S.A.AR86 were identical to those of AR339. 
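
The length differences described above can be tallied as a simple consistency check. The following is a minimal sketch in Python that uses only the insertion and deletion sizes stated in the text; the sign convention and all names are assumptions made for illustration and do not form part of the disclosure.

```python
# Bookkeeping for the S.A.AR86 insertions and deletions relative to Sindbis
# AR339, using only figures stated in the text.
# Convention (an assumption of this sketch): +n = insertion, -n = deletion.

NSP3_INDELS_NT = [+6, +6, +3, -54]   # two 6-nt and one 3-nt insertions, one 54-nt deletion
UTR3_INDELS_NT = [-1, -1, +1]        # two 1-nt deletions, one 1-nt insertion


def net_change(indels):
    """Net length change, in nucleotides, for a list of signed indel sizes."""
    return sum(indels)


def codons(nucleotides):
    """Convert an in-frame nucleotide count to a codon (amino acid) count."""
    if nucleotides % 3 != 0:
        raise ValueError("indel is not a whole number of codons")
    return nucleotides // 3


nsp3_net_nt = net_change(NSP3_INDELS_NT)
utr3_net_nt = net_change(UTR3_INDELS_NT)
genome_net_nt = nsp3_net_nt + utr3_net_nt

aa_inserted = sum(codons(n) for n in NSP3_INDELS_NT if n > 0)
aa_deleted = sum(codons(-n) for n in NSP3_INDELS_NT if n < 0)

print("nsP3 net change (nt):", nsp3_net_nt)               # -39
print("3' UTR net change (nt):", utr3_net_nt)             # -1
print("genome net change (nt):", genome_net_nt)           # -40
print("nsP3 net change (aa):", aa_inserted - aa_deleted)  # -13
```

The totals agree with the stated 40-nucleotide difference in genome length and the 13-amino acid difference in nsP3.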

A notable feature of the deduced amino acid sequence of S.A.AR86 
(Figure 2, SEQ ID NO:2 and SEQ ID NO:3) was the cysteine codon in place of 
an opal termination codon between nsP3 and nsP4. S.A.AR86 is the only 
alphavirus of the Sindbis group, and one of just three alphavirus isolates sequenced 
to date, which do not contain an opal termination codon between nsP3 and nsP4. 
Takkinen, K., Nucleic Acids Res. 14, 5667-5682 (1986); Strauss et al., Virology 
164, 265-74 (1988). 

The genome of Girdwood S.A. was 11,717 nucleotides long 
excluding the 5' CAP and 3' poly(A) tail. The nucleotide sequence (SEQ ID 
NO:4) of the Girdwood S.A. genome and the putative amino acid sequence (SEQ 
ID NO:5 and SEQ ID NO:6) of the Girdwood S.A. gene products are shown in 
Figure 3 and Figure 4, respectively. The asterisk at position 1902 in SEQ ID 
NO:5 indicates the position of the opal termination codon in the coding region of 
the nonstructural polyprotein. The extra nucleotides relative to AR339 were in the 
nonconserved half of nsP3, which contained insertions totalling 15 nucleotides, and 
in the 3' untranslated region which contained two 1-nucleotide deletions and a 1- 
nucleotide insertion with respect to the consensus Sindbis AR339 genome. The 
insertions found in the nsP3 gene of Girdwood S.A. were identical in position and 
content to those found in S.A.AR86, although Girdwood S.A. did not have the 
large nsP3 deletion characteristic of S.A.AR86. The remaining portions of the 
genome contained the same number of nucleotides and predicted amino acids as 
Sindbis AR339. 

Overall, Girdwood S.A. was 94.5% identical to the consensus 
Sindbis AR339 sequence, differing at 655 nucleotides not including the insertions 
and deletions. These nucleotide differences resulted in 88 predicted amino acid 
changes or a difference of 2.3%. A plurality of amino acid differences were 
concentrated in the nsP3 gene, which contained 32 of the amino acid changes, 25 
of which were in the nonconserved 3' half. 



The Girdwood S.A. nucleotides at positions 1, 3, and 11,717 could 
not be resolved. Because the primer used during the RT-PCR amplification of the 
3' end of the genome assumed a cytosine in the 3' terminal position, the identity 
of this nucleotide could not be determined with certainty. However, in all 
alphaviruses sequenced to date there is a cytosine in this position. This, combined 
with the fact that no difficulty was encountered in obtaining RT-PCR product for 
this region with an oligo(dT) primer ending with a 3'G, suggested that Girdwood 
S.A. also contains a cytosine at this position. The ambiguity at nucleotide 
positions 1 and 3 resulted from strong stops encountered during the RNA 
sequencing. 



EXAMPLE 3 
Comparison of S.A.AR86 and Girdwood S.A. 
Sequences With Other Sindbis-Related Virus Sequences 

Table 1 examines the relationship of S.A.AR86 and Girdwood S.A. 
to each other and to other Sindbis-related viruses. This was accomplished by 
aligning the nucleotide and deduced amino acid sequences of Ockelbo82, AR339 
and Girdwood S.A. to those of S.A.AR86 and then calculating the percentage 
identity for each gene using the programs contained within the Wisconsin GCG 
package (Genetics Computer Group, 575 Science Drive, Madison WI 53711), as 
described in more detail in McKnight et al., J. Virol. 70, 1981-89 (1996). 
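
For illustration only, a percentage identity of the kind described above can be computed from two already-aligned sequences as sketched below in Python. This is not the GCG implementation, which may treat gaps and ambiguous residues differently. The function name, the gap symbol, and the toy sequences are hypothetical, and the sketch assumes that columns containing a gap in either sequence are excluded from the comparison.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption of this sketch: gapped columns are not scored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from any sequence in this document.
toy_a = "AUGGCU---CGAUUA"
toy_b = "AUGGCAGGUCGAUCA"
print(round(percent_identity(toy_a, toy_b), 1))  # 83.3 (10 matches in 12 ungapped columns)
```

Applied gene by gene to the aligned sequences, a calculation of this kind yields one identity value per gene for each pair of viruses compared.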

The analysis suggests that S.A.AR86 is most similar to the other 
South African isolate, Girdwood S.A., and that the South African isolates are more 
similar to the Swedish Ockelbo82 isolate than to the Egyptian Sindbis AR339 
isolate. These results also suggest that it is unlikely that S.A.AR86 is a 
recombinant virus like WEE virus. Hahn et al., Proc. Natl. Acad. Sci. USA 85, 
5997-6001 (1988). 

EXAMPLE 4 
Neurovirulence of S.A.AR86 and Girdwood S.A. 
Girdwood S.A., Ockelbo82, and S.A.AR86 are related by sequence; 
in contrast, it has previously been reported that only S.A.AR86 displayed the adult 
mouse neurovirulence phenotype. Russell et al., J. Virol. 63, 1619-29 (1989). 
These findings were confirmed by the present investigations. Briefly, groups of 
four female CD-1 mice (3-6 weeks of age) were inoculated ic with 10» plaque- 
forming units (PFU) of S.A.AR86, Girdwood S.A., or Ockelbo82. Neither 
Girdwood S.A. nor Ockelbo82 infection produced any clinical signs of infection. 
Infection with S.A.AR86 produced neurological signs within four to five days and 
ultimately killed 100% of the mice as previously demonstrated. 



Table 2 lists those amino acids of S.A.AR86 which might explain 
the neurovirulence phenotype in adult mice. A position was scored as potentially 
related to the S.A.AR86 adult neurovirulence phenotype if the S.A.AR86 amino 
acid differed from that which otherwise was absolutely conserved at that position 
in the other viruses. 



TABLE 2 

Divergent Amino Acids in S.A.AR86 
Potentially Related to the Adult Neurovirulence Phenotype 

         Position in    S.A.AR86      Conserved 
         S.A.AR86       Amino Acid    Amino Acid 

nsP1     583            Thr           Ile 
nsP2     256            Arg           Ala 
         648            Ile           Val 
         651            Lys           Glu 
nsP3     344            Gly           Glu 
         386            Tyr           Ser 
         441            Asp           Gly 
         445            Ile           Met 
         537            Cys           Opal 
E2       243            Ser           Leu 
6K       30             Val           Ile 
E1       112            Val           Ala 
         169            Leu           Ser 

EXAMPLE 5 
pS55 Molecular Clone of S.A.AR86 
As a first step in investigating the unique adult mouse 
neurovirulence phenotype of S.A.AR86, a full-length cDNA clone of the 
S.A.AR86 genome was constructed. The sources of cDNA included conventional 
cDNA clones (Davis et al., Virology 171, 189-204 (1989)) as well as uncloned 
RT-PCR fragments derived from the S.A.AR86 genome. As described previously, 
these were substituted, starting at the 3' end, into pTR5000 (McKnight et al., J. 
Virol. 70, 1981-89 (1996)), a full-length Sindbis clone from which infectious 
genomic replicas could be derived by transcription with SP6 polymerase in vitro. 

The end result was pS55, a molecular clone of S.A.AR86 from 
which infectious transcripts could be produced and which contained four nucleotide 
changes (G for A at nt 215; G for C at nt 3863; G for A at nt 5984; and C for T 
at nt 9113) but no amino acid coding differences with respect to the S.A.AR86 
genomic RNA (amino acid sequence of S.A.AR86 presented in Figure 2 (SEQ ID 
NO:2 and SEQ ID NO:3)). The nucleotide sequence of clone pS55 is presented 
in Figure 5 (SEQ ID NO:7). 

As has been described by Simpson et al., Virology 222, 464-69 
(1996), neurovirulence and replication of the virus derived from pS55 (S55) were 
compared with those of S.A.AR86. It was found that S55 exhibits the distinctive 
adult neurovirulence characteristic of S.A.AR86. Like S.A.AR86, S55 produces 
100% mortality in adult mice infected with the virus and the survival times of 
animals infected with the two viruses were indistinguishable. In addition, S55 and 
S.A.AR86 were found to replicate to essentially equivalent titers in vivo, and the 
profiles of S55 and S.A.AR86 virus growth in the central nervous system and 
periphery were very similar. 

From these data it was concluded that the silent changes found in 
virus derived from clone pS55 had little or no effect on its growth or virulence, 
and that this molecularly cloned virus accurately represents the biological isolate, 
S.A.AR86. 

EXAMPLE 6 
Construction of the Consensus AR339 Virus TR339 
The consensus sequence of the Sindbis virus AR339 isolate, the 
prototype alphavirus, was deduced. The consensus AR339 sequence was inferred 
by comparison of the TRSB sequence (a laboratory-derived AR339 strain) with the 
complete or partial sequences of HRSP (the GenBank sequence; Strauss et al., 
Virology 133, 92-110 (1984)), SV1A, and NSV (AR339-derived laboratory strains; 
Lustig et al., J. Virol. 62, 2329-36 (1988)), and SIN (a laboratory-derived AR339 
strain; Davis et al., Virology 161, 101-108 (1987); Strauss et al., J. Virol. 65, 
4654-64 (1991)). Each of these viruses was descended from AR339. Where these 
sequences differed from each other, they also were compared with the amino acid 
sequences of other viruses related to Sindbis virus: Ockelbo82, S.A.AR86, 
Girdwood S.A., and the somewhat more distantly related Aura virus. Rumenapf 
et al., Virology 208, 621-33 (1995). 

The details of determining a consensus AR339 sequence and 
constructing the consensus virus TR339 have been described elsewhere. McKnight 
et al., J. Virol. 70, 1981-89 (1996); Klimstra et al., manuscript in preparation. 
The nucleotide sequence (SEQ ID NO:8) of pTR339 is presented in Figure 6. 
The deduced amino acid sequences of the pTR339 non-structural and structural 
polyproteins are shown as SEQ ID NO:9 and SEQ ID NO:10, respectively. The 
asterisk at position 1897 in SEQ ID NO:9 indicates the position of the opal 
termination codon in the coding region of the nonstructural polyprotein. The 
consensus nucleotide sequence diverged from the pTRSB sequence at three coding 
positions (nsP3 528, E2 1, and E1 72). These differences are illustrated in Table 
3. 
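
The three coding differences listed in Table 3 can be checked against the standard genetic code. The following Python sketch is offered only as an illustration: the codon table is limited to the six codons involved, the E2 1 serine codon is taken to be AGC, and all names are hypothetical.

```python
# Subset of the standard genetic code (RNA codons) covering only the codons
# that appear in Table 3.
CODON_TO_AA = {
    "CGA": "Arg", "CAA": "Gln",
    "AGC": "Ser", "AGA": "Arg",
    "GCU": "Ala", "GUU": "Val",
}

# site -> (TR339 codon, TRSB codon); the E2 1 serine codon is assumed to be AGC.
TABLE_3 = {
    "nsP3 528": ("CGA", "CAA"),
    "E2 1": ("AGC", "AGA"),
    "E1 72": ("GCU", "GUU"),
}


def describe(site, tr339_codon, trsb_codon):
    """Translate both codons and list the codon positions (1-3) that differ."""
    changed = [i + 1 for i, (a, b) in enumerate(zip(tr339_codon, trsb_codon)) if a != b]
    return {
        "site": site,
        "TR339": CODON_TO_AA[tr339_codon],
        "TRSB": CODON_TO_AA[trsb_codon],
        "changed_codon_positions": changed,
    }


rows = [describe(site, *codon_pair) for site, codon_pair in TABLE_3.items()]
for row in rows:
    print(row)
```

Under these assumptions each of the three amino acid differences corresponds to a single nucleotide change within its codon.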

TABLE 3 

Amino Acid Differences Between 
Laboratory Strain TRSB and Molecular Clone TR339 

         nsP3 528 (nt 5683)    E2 1 (nt 8633)    E1 72 (nt 10279) 

TR339    Arg (CGA)             Ser (AGC)         Ala (GCU) 
TRSB     Gln (CAA)             Arg (AGA)         Val (GUU) 

EXAMPLE 7 
Animals Used for In Vivo Localization Studies 
Specific pathogen free CD-1 mice were obtained from Charles River 
Breeding Laboratories (Raleigh, North Carolina) at 21 days of age and maintained 
under barrier conditions until approximately 37 days of age. Intracerebral (ic) 
inoculations were performed as previously described, Simpson et al., Virol. 222, 
464-69 (1996), with 500 PFU of S51 (an attenuated mutant of S55) or 10^3 PFU of 
S55. Animals inoculated peripherally were first anesthetized with METOFANE®. 
Then, 25 µl of diluent (PBS, pH 7.2, 1% donor calf serum, 100 u/ml penicillin, 
50 µg/ml streptomycin, 0.9 mM CaCl2, and 0.5 mM MgCl2) containing 10^3 PFU 
of virus were injected either intravenously (iv) into the tail vein, subcutaneously 
(sc) into the skin above the shoulder blades on the middle of the back, or 
intraperitoneally (ip) in the lower right abdomen. Animals were sacrificed at 
various times post-inoculation as previously described. Simpson et al., Virol. 222, 
464-69 (1996). Brains (including brainstems) were homogenized in diluent to 30% 
w/v, and right quadriceps were homogenized in diluent to 25% w/v. Homogenates 
were handled and titered as described previously. Simpson et al., Virol. 222, 
464-69 (1996). Bone marrow was harvested by crushing both femurs from each animal 
in sufficient diluent to produce a 30% w/v suspension (calculated as weight of 
uncrushed femurs in volume of diluent). Samples were stored at -70°C. For 
titration, samples were thawed and clarified by centrifugation at 1,000 x g for 20 
minutes at 4°C before being titered by conventional plaque assay on BHK-21 cells. 
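
By way of illustration, the volume of diluent needed to reach a stated % w/v can be worked out as sketched below in Python. The sketch assumes that % w/v means grams of tissue per 100 ml of diluent, in line with the femur calculation described above. The function name and the example tissue weights are hypothetical.

```python
def diluent_volume_ml(tissue_weight_g: float, percent_wv: float) -> float:
    """Volume of diluent (ml) giving `percent_wv` grams of tissue per 100 ml.

    Assumption of this sketch: the tissue's own volume is ignored, so the
    percentage is weight of tissue over volume of diluent added.
    """
    if tissue_weight_g <= 0:
        raise ValueError("tissue weight must be positive")
    if percent_wv <= 0:
        raise ValueError("percent w/v must be positive")
    return tissue_weight_g * 100.0 / percent_wv


# Hypothetical weights, for illustration only.
print(diluent_volume_ml(0.45, 30))  # brain at 30% w/v      -> 1.5 ml
print(diluent_volume_ml(0.50, 25))  # quadriceps at 25% w/v -> 2.0 ml
```

A titer measured per ml of such a homogenate can then be expressed per gram of tissue by dividing by the grams of tissue per ml (0.30 g/ml for a 30% w/v homogenate).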

EXAMPLE 8 
Tissue Preparation for In Situ Hybridization Studies 
Animals were anesthetized by ip injection of 0.5 ml AVERTIN® at 
various times post-inoculation followed by perfusion with 60 to 75 ml of 4% 
paraformaldehyde in PBS (pH 7.2) at a flow rate of 10 ml per minute. The entire 
carcass was decalcified for 8 to 10 weeks in 4% paraformaldehyde containing 8% 
EDTA in PBS (pH 6.8) at 4°C. This solution was changed twice during the 
decalcification period. Selected tissues were cut into blocks approximately 3 mm 
thick and placed into biopsy cassettes for paraffin embedding and sectioning. 
Blocks were embedded, sectioned and hematoxylin/eosin stained by Experimental 
Pathology Laboratories (Research Triangle Park, North Carolina) or North 
Carolina State University Veterinary School Pathology Laboratory (Raleigh, North 
Carolina). 



EXAMPLE 9 
In Situ Hybridization 
Hybridizations were performed using a [35S]-UTP labeled S.A.AR86 
specific riboprobe derived from pDS-45. Clone pDS-45 was constructed by first 
amplifying a 707 base pair fragment from pS55 by PCR using primers 7241 (5'- 
CTGCGGCGGATTCATCTTGC-3', SEQ ID NO:11) and SC-3 (5'- 
CTCCAACTTAAGTG-3', SEQ ID NO:12). The resulting 707 base pair fragment 
was purified using a GENE CLEAN® kit (Bio101, CA), digested with HhaI, and 
cloned into the SmaI site of pSP72 (Promega). Linearizing pDS-45 with EcoRV 
and performing an in vitro transcription reaction with SP6 DNA-dependent RNA 
polymerase (Promega) in the presence of [35S]-UTP resulted in a riboprobe 
approximately 500 nucleotides in length of which 445 nucleotides were 
complementary to the S.A.AR86 genome (nucleotides 7371 through 7816). A 
riboprobe specific for the influenza strain PR-8 hemagglutinin (HA) gene was used 
as a control probe to test non-specific binding. The in situ hybridizations were 
performed as described previously (Charles et al., Virol. 208, 662-71 (1995)) using 
10^5 cpm of probe per slide. 



EXAMPLE 10 
Replication of S.A.AR86 in Bone Marrow 
Three groups of six adult mice each were inoculated peripherally 
(sc, ip, or iv) with 1200 PFU of S55 (a molecular clone of S.A.AR86) in 25 µl 
of diluent. Under these conditions, the infection produced no morbidity or 
mortality. Two mice from each group were anesthetized and sacrificed at 2, 4 and 
6 days post-inoculation by exsanguination. The serum, brain (including 
brainstem), right quadriceps, and both femurs were harvested and titered by plaque 
assay. Virus was never detected in the quadriceps samples of animals inoculated 
sc (Table 4). A single animal inoculated ip (two days post-inoculation) and two 
mice inoculated iv (at four and six days post-inoculation) had detectable virus in 
the right quadriceps, but the titer was at or just above the limit of detection (6.25 
PFU/g tissue). Virus was present sporadically or at low levels in the brain and 
serum of animals regardless of the route of inoculation. Virus was detected in the 
bone marrow of animals regardless of the route of inoculation. However, the 
presence of virus in bone marrow of animals inoculated sc or ip was more sporadic 
than in animals inoculated iv, where five out of six animals had detectable virus. 
These results suggest that S55 targets to the bone marrow, especially following iv 
inoculation. 

The level and frequency of virus detected in the serum and muscle 
suggested that virus detected in the bone marrow was not residual virus 
contamination from blood or connective tissue remaining in bone marrow samples. 
The following experiment also suggested that virus in bone marrow was not due 
to tissue or serum contamination. Mice were inoculated ic with 1200 PFU of S55 
in 25 µl of diluent. Animals were sacrificed at 0.25, 0.5, 1, 1.5, 2, 3, 4, 5, and 
6 days post-inoculation, and the carcasses were decalcified as described in 
Example 8. Coronal sections taken at approximately 3 mm intervals through the 
head, spine (including shoulder area), and hips were probed with an S55-specific 
[35S]-UTP labeled riboprobe derived from pDS-45. Positive in situ hybridization 
signal was detected by one day post-inoculation in the bone marrow of the skull 
(data not shown). Weak signal also was present in some of the chondrocytes of 
the vertebrae, suggesting that S55 was replicating in these cells as well. Although 
the frequency of positive bone marrow cells was low, the signal was very intense 
over individual positive cells. This result strongly suggests that S55 replicates in 
vivo in a subset of cells contained in the bone marrow. 

EXAMPLE 11 
Other Sindbis Group Viruses 
It was of interest to determine if the ability to replicate in the bone 
marrow of mice was unique to S55 or was a general feature of other viruses, both 
Sindbis and non-Sindbis viruses, in the Sindbis group. Six 38-day-old female CD-1 
mice were inoculated iv with 25 µl of diluent containing 10^3 PFU of S55, 
Ockelbo82, Girdwood S.A., TR339, or TRSB. At 2, 4 and 6 days post-inoculation 
two mice from each group were sacrificed and whole blood, serum, 
brain (including brainstem), right quadriceps, and both femurs were harvested for 
virus titration. 

The results of this experiment were similar to those with S55. 
TRSB infected animals had no virus detectable in serum or whole blood in any 
animal at any time, and with the other viruses tested, no virus was detected in the 
serum or whole blood of any animal beyond two days post-inoculation (detection 
limit, 25 PFU/ml). Neither TRSB nor TR339 was detectable in the brains of 
infected animals at any time post-inoculation. S55, Girdwood S.A., and 
Ockelbo82 were present in the brains of infected animals sporadically with the 
titers being at or near the 75 PFU/g level of detection. All the tested viruses were 
found sporadically at or slightly above the 50 PFU/g detection limit in the right 
quadriceps of infected animals except for a single animal four days post-inoculation 
with TRSB which had nearly 10 s PFU/g of virus in its quadriceps. 

The frequency at which the different viruses were detected in bone 
marrow varied widely, with S55 and Girdwood S.A. being the most frequently 
isolated (five out of six animals) and Ockelbo82 and TRSB being the least 
frequently isolated from bone marrow (one out of six animals and two out of six 
animals, respectively) (Table 4). Girdwood S.A. and S55 gave nearly identical 
profiles in all tissues. Girdwood S.A., unlike S.A.AR86, is not neurovirulent in 
adult mice (Example 4), suggesting that the adult neurovirulence phenotype is 
distinct from the ability of the virus to replicate efficiently in bone marrow. 

EXAMPLE 12 
Virus Persistence in Bone Marrow 
The next step in our investigations was to evaluate the possibility 
that S.A.AR86 persisted long-term in bone marrow. S51 is a molecularly cloned, 
attenuated mutant of S55. S51 differs from S55 by a threonine for isoleucine 
substitution at amino acid residue 538 of nsP1 and is attenuated in adult mice 
inoculated intracerebrally. Like S55, S51 targeted to and replicated in the bone 
marrow of 37-day-old female CD-1 mice following ic inoculation. Mice were 
inoculated ic with 500 PFU of S51 and sacrificed at 4, 8, 16, and 30 days 
post-inoculation for determination of bone marrow and serum titers. At no time 
post-inoculation was virus detected in the serum above the 6.25 PFU/ml detection limit. 
Virus was detectable in the bone marrow samples of both animals sampled at four 
days post-inoculation and in one animal eight days post-inoculation (Table 5). No 
virus was detectable by titration on BHK-21 cells in any of the bone marrow 
samples beyond eight days post-inoculation. These results suggested that the 
attenuating mutation present in S51, which reduces the neurovirulence of the virus, 
did not impair acute viral replication in the bone marrow. 

It was notable that the plaque size on BHK-21 cells of virus 
recovered on day 4 post-inoculation was smaller than the size of plaques produced 
by the inoculum virus, and that plaques produced from virus recovered from the 
day 8 post-inoculation samples were even smaller and barely visible. This 
suggests a strong selective pressure in the bone marrow for virus that is much less 
efficient in forming plaques on BHK-21 cells. 



To demonstrate that S51 virus genomes were present in bone 
marrow cells long after acute infection, four to six-week-old female CD-1 mice 
were inoculated ic with 500 PFU of S51. Three months post-inoculation two 
animals were sacrificed, perfused with paraformaldehyde and decalcified as 
described in Example 8. The heads and hind limbs from these animals were 
paraffin embedded, sectioned, and probed with a S.A.AR86 specific [35S]-UTP 
labeled riboprobe derived from clone pDS-45. In situ hybridization signal was 
clearly present in discrete cells of the bone and bone marrow of the legs (data not 
shown). Furthermore, no in situ hybridization signal was detected in an adjacent 
control section probed with an influenza virus HA gene specific riboprobe. As the 
relative sensitivity of in situ hybridization is reduced in decalcified tissues (Peter 
Charles, personal communication), these cells likely contain a relatively high 
number of viral sequences, even at three months post-inoculation. No in situ 
hybridization signal was observed in mid-sagittal sections of the heads with the 
S.A.AR86 specific probe, although focal lesions were observed in the brain 
indicative of the prior acute infection with S51. 



TABLE 5 

S51 Titers in Bone Marrow Following IC Inoculation of 500 PFU 

Days Post-      Titers (Total PFU/Animal)      Limit of 
Inoculation     Animal A      Animal B         Detection 

4               2100          380              62.5 
8               62.5          N.D.a            62.5 
16              N.D.          N.D.             62.5 
30              N.D.          N.D.             62.5 

a "N.D." indicates that the virus titers were below the limit of detection. 

EXAMPLE 13 

Replication of S.A.AR86 within Bone/Joint Tissue of Adult Mice 

Several Old World alphaviruses, including Ross River virus, 
Chikungunya virus, Ockelbo82, and S.A.AR86 are associated with acute and persistent 
arthritis/arthralgia in humans. Molecular clones of several Sindbis group viruses, 
including S.A.AR86, were used to investigate alphavirus replication within bone/joint 
tissue. 


Following intravenous inoculation of S.A.AR86 into adult CD-1 mice, 
viral replication was observed in bone/joint tissue, but not surrounding muscle tissue of 
the hind limbs. Infectious virus was detectable 24 hours post-infection; however, viral 
titer within bone/joint tissue was maximal 72 hours post-infection. Fractionation of 
hind limbs from infected animals revealed that the hip and knee joints were the 
predominant sites of viral replication. Replication within bone/joint tissue appears to be 
a common trait of Sindbis-group viruses, since the laboratory strains TR339 and TRSB 
also replicated within bone/joint tissue. In situ hybridization and S.A.AR86 based 
double promoter vectors expressing green fluorescent protein were used to further 
localize S.A.AR86 infected cells within bone/joint tissue. Green fluorescent protein 
expression was detected in bone/joint tissue for at least one month post-inoculation. 
These studies demonstrated that cells within the endosteum of synovial joints were the 
predominant site of S.A.AR86 replication. 

THAT WHICH IS CLAIMED IS: 

1. A method of introducing and expressing heterologous RNA 
in bone marrow cells, comprising: 

(a) providing a recombinant alphavirus, said alphavirus 
containing a heterologous RNA segment, said heterologous RNA segment 
comprising a promoter operable in said bone marrow cells operatively associated 
with a heterologous RNA to be expressed in said bone marrow cells; and then 

(b) contacting said recombinant alphavirus to said bone marrow 
cells so that said heterologous RNA segment is introduced and expressed therein. 

2. A method according to claim 1, wherein said contacting step 
is carried out in vitro. 

3. A method according to claim 1, wherein said contacting step 
is carried out in vivo in a subject in need of such treatment. 

4. A method according to claim 1, wherein said heterologous 
RNA encodes a protein or peptide. 

5. A method according to claim 1, wherein said heterologous 
RNA encodes an immunogenic protein or peptide. 

6. A method according to claim 1, wherein said heterologous 
RNA encodes an antisense oligonucleotide or a ribozyme. 

7. A method according to claim 1, wherein said alphavirus is 
an Old World alphavirus. 

8. A method according to claim 1, wherein said alphavirus is 
selected from the group consisting of SF group and SIN group alphaviruses. 

9. A method according to claim 1, wherein said alphavirus is 
selected from the group consisting of Semliki Forest virus, Middelburg virus, 
Chikungunya virus, O'Nyong-Nyong virus, Ross River virus, Barmah Forest 
virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus, Una virus, 
Sindbis virus, South African Arbovirus No. 86, Ockelbo virus, Girdwood S.A. 
virus, Aura virus, Whataroa virus, Babanki virus, and Kyzylagach virus. 

10. A method according to claim 1, wherein said alphavirus is 
South African Arbovirus No. 86. 



11. A method according to claim 1, wherein said alphavirus is 
Girdwood S.A. 

12. A method according to claim 1, wherein said alphavirus is 
Sindbis strain TR339. 

13. A helper cell for expressing an infectious, propagation 
defective, Girdwood S.A. virus particle, comprising, in a Girdwood S.A.-permissive 
cell: 

(a) a first helper RNA encoding (i) at least one Girdwood S.A. 
structural protein, and (ii) not encoding at least one other Girdwood S.A. structural 
protein; and 

(b) a second helper RNA separate from said first helper RNA, 
said second helper RNA (i) not encoding said at least one Girdwood S.A. 
structural protein encoded by said first helper RNA, and (ii) encoding said at least 
one other Girdwood S.A. structural protein not encoded by said first helper RNA, 
and with all of said Girdwood S.A. structural proteins encoded by said first and 
second helper RNAs assembling together into Girdwood S.A. particles in said cell 
containing said replicon RNA; 

and wherein the Girdwood S.A. packaging segment is deleted from 
at least said first helper RNA.

14. The helper cell according to claim 13, further containing a
replicon RNA;

said replicon RNA encoding said Girdwood S.A. packaging segment 
and an inserted heterologous RNA; 

wherein said Girdwood S.A. packaging segment is deleted from at
least one of said helper RNAs;

and wherein said replicon RNA, said first helper RNA, and said 
second helper RNA are all separate molecules from one another. 

15. The helper cell according to claim 13, further containing a
replicon RNA;

said replicon RNA encoding said Girdwood S.A. packaging segment
and an inserted heterologous RNA;

wherein said replicon RNA and said first helper RNA are separate
molecules;

and wherein the molecule containing said replicon RNA further
contains RNA encoding said at least one Girdwood S.A. structural protein not
encoded by said first helper RNA.

16. The helper cell according to claim 13, wherein said first 
helper RNA encodes both the Girdwood S.A. E1 glycoprotein and the Girdwood
S.A. E2 glycoprotein, and wherein said second helper RNA encodes the Girdwood 
S.A. capsid protein. 

17. A method of making infectious, propagation defective, 
Girdwood S.A. virus particles, comprising: 

transfecting a Girdwood S.A.-permissive cell according to claim 13 
with a propagation defective replicon RNA, said replicon RNA including said 
Girdwood S.A. packaging segment and an inserted heterologous RNA; 

producing said Girdwood S.A. virus particles in said transfected
cell; and then

collecting said Girdwood S.A. virus particles from said cell.

18. Infectious Girdwood S.A. virus particles produced by the
method of claim 17.

19. Infectious Girdwood S.A. virus particles containing a
replicon RNA encoding a promoter, an inserted heterologous RNA, and wherein 
RNA encoding at least one Girdwood S.A. structural protein is deleted therefrom 
so that said Girdwood S.A. virus particle is propagation defective. 

20. A pharmaceutical formulation comprising infectious 
Girdwood S.A. virus particles according to claim 18 or 19 in a pharmaceutically 
acceptable carrier. 

21. A helper cell for expressing an infectious, propagation 
defective, TR339 virus particle, comprising, in a TR339-permissive cell:

(a) a first helper RNA encoding (i) at least one TR339 structural 
protein, and (ii) not encoding at least one other TR339 structural protein; and 

(b) a second helper RNA separate from said first helper RNA, 
said second helper RNA (i) not encoding said at least one TR339 structural protein 
encoded by said first helper RNA, and (ii) encoding said at least one other TR339 
structural protein not encoded by said first helper RNA, and with all of said 
TR339 structural proteins encoded by said first and second helper RNAs 
assembling together into TR339 particles in said cell containing said replicon 
RNA; 

and wherein the TR339 packaging segment is deleted from at least 
said first helper RNA. 

22. The helper cell according to claim 21, further containing a
replicon RNA;

said replicon RNA encoding said TR339 packaging segment and an 
inserted heterologous RNA; 

wherein said TR339 packaging segment is deleted from at least one
of said helper RNAs;

and wherein said replicon RNA, said first helper RNA, and said 
second helper RNA are all separate molecules from one another.

23. The helper cell according to claim 21, further containing a
replicon RNA;

said replicon RNA encoding said TR339 packaging segment and an 
inserted heterologous RNA; 

wherein said replicon RNA and said first helper RNA are separate
molecules;

and wherein the molecule containing said replicon RNA further 
contains RNA encoding said at least one TR339 structural protein not encoded by 
said first helper RNA. 

24. The helper cell according to claim 21, wherein said first 
helper RNA encodes both the TR339 E1 glycoprotein and the TR339 E2
glycoprotein, and wherein said second helper RNA encodes the TR339 capsid 
protein. 

25. A method of making infectious, propagation defective, 
TR339 virus particles, comprising: 

transfecting a TR339-permissive cell according to claim 21 with a 
propagation defective replicon RNA, said replicon RNA including said TR339 
packaging segment and an inserted heterologous RNA; 

producing said TR339 virus particles in said transfected cell; and then

collecting said TR339 virus particles from said cell. 

26. Infectious TR339 virus particles produced by the method of
claim 25.

27. Infectious TR339 virus particles containing a replicon RNA
encoding a promoter, an inserted heterologous RNA, and wherein RNA encoding 
at least one TR339 structural protein is deleted therefrom so that said virus particle 
is propagation defective. 

28. A pharmaceutical formulation comprising infectious TR339 
virus particles according to claim 26 or 27 in a pharmaceutically acceptable carrier.

29. A recombinant DNA comprising a cDNA coding for an
infectious Girdwood S.A. virus RNA transcript and a heterologous promoter
positioned upstream from said cDNA and operatively associated therewith. 

30. An infectious RNA transcript encoded by a cDNA according
to claim 29.

31. An infectious RNA according to claim 30, said infectious
Girdwood S.A. RNA transcript containing a heterologous RNA segment, said 
heterologous RNA segment comprising a promoter operably associated with a 
heterologous RNA. 

32. Infectious viral particles containing an RNA transcript 
according to claim 30. 

33. A recombinant DNA comprising a cDNA coding for a 
Sindbis strain TR339 RNA transcript and a heterologous promoter positioned 
upstream from said cDNA and operatively associated therewith. 

34. An infectious RNA transcript encoded by a cDNA according
to claim 33.

35. An infectious RNA according to claim 34, said infectious 
Girdwood S.A. RNA transcript containing a heterologous RNA segment, said 
heterologous RNA segment comprising a promoter operably associated with a 
heterologous RNA. 

36. Infectious viral particles containing an RNA transcript 
according to claim 34. 
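
For illustration only, the complementarity conditions recited for the two helper RNAs and the replicon RNA in claims 13, 14 and 16 (and, for TR339, claims 21, 22 and 24) can be sketched as a small consistency check. The names `Rna` and `check_helper_system`, and the treatment of capsid, E1 and E2 as the complete structural set, are assumptions made for this sketch and do not appear in the claims. The arrangement of claims 15 and 23, in which the replicon molecule also carries a structural gene, is not modeled.

```python
from dataclasses import dataclass

# Structural proteins named in claims 16 and 24. Treating these three as the
# complete structural set is an assumption made only for this illustration.
STRUCTURAL = frozenset({"capsid", "E1", "E2"})


@dataclass(frozen=True)
class Rna:
    """Hypothetical record describing one RNA molecule in the helper cell."""
    name: str
    structural: frozenset          # structural proteins this molecule encodes
    has_packaging_segment: bool


def check_helper_system(first: Rna, second: Rna, replicon: Rna) -> list:
    """Return the list of violated conditions (empty when all are met)."""
    problems = []
    if not first.structural or not second.structural:
        problems.append("each helper RNA must encode at least one structural protein")
    if first.structural & second.structural:
        problems.append("the helper RNAs must not encode the same structural protein")
    if (first.structural | second.structural) != STRUCTURAL:
        problems.append("the helper RNAs together must supply every structural protein")
    if first.has_packaging_segment:
        problems.append("the packaging segment must be deleted from the first helper RNA")
    if not replicon.has_packaging_segment:
        problems.append("the replicon RNA must carry the packaging segment")
    return problems


# Arrangement of claim 16: glycoproteins on the first helper, capsid on the second.
first = Rna("first helper", frozenset({"E1", "E2"}), has_packaging_segment=False)
second = Rna("second helper", frozenset({"capsid"}), has_packaging_segment=False)
replicon = Rna("replicon", frozenset(), has_packaging_segment=True)
print(check_helper_system(first, second, replicon))  # prints []
```

Splitting the structural genes across two separate helper molecules means that no single helper RNA encodes a complete set of structural proteins, which is the condition the second check expresses.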

Nucleotide Sequence of S.A.AR86



1 ATTCCCCCCQ TAGTACACAC tattoaatca aacaccccac Caattccact accatcacaa tccacaaocc actacttaac otacacgtag accctcagao

101 TCCCTTTCTC CTCCAACTCC AAAACACCTT CCCOCAATTT GAGGTAGTAO CACAGCAGGT CACTCCAAAT GACCATCCTA ATCCCAOAGC Al 1 i ILL CAT 
201 CrCCCCACTA AACTAATCCA CCTCCACCTT CCTACCACAO CGACCATTTT CCACATAGCC ACCGCACCGC CTCCTAGAAT GTTTTCCGAG CACCAOTACC
301 ATTGCGTTTG CCCCATGCGT AGTCCAGAAO ACCCGGACCG CATGATGAAA TATGCCAGCA AACTGGCGGA AAAAGCATGT AAOATTACAA ACAAGAACTT 
401 GCATOAGAAG ATCAAGGACC TCCGGACCGT ACTTGATACA CCGGATCCTG AAACGCCATC ACTCTGCTTC CACAACGATG TTACCTCCAA CACGCGTCCC 
501 GAGTACTCCG TCATOCAGGA CGTGTACATC AACGCTCCCG GAACTATTTA CCACCAGGCT ATGAAAGGCG TGCGGACCCT CTACTGGATT CGCTPCOACA 
601 CCACCCAGTT CATCTTCTCO GCTATGGCAG GTTCGTACCC TGCATACAAC ACCAACTCGG CCCACGAAAA AGTCCTTGAA CCGCGTAACA TCCGACTCTG 
701 CAGCACAAAG CTGAGTGAAC GCAGGACAGG AAAGTTGTCG ATAATGAGGA AGAAGGAGTT GAAGCCCGGG TCACGGGTTT ATTTCTCCGT TGGATCGACA 
801 CTTTACCCAO AACACAGAGC CAGCTTGCAG AGCTGGCATC TTCCATCGGT GTTCCACTTG AAAGGAAAGC AGTCGTACAC TTGCCGCTGT GATACAGTGG
901 TGAGCTGCGA AGGCTACGTA GTGAAGAAAA TCACCATCAG TCCCCGGATC ACGGGAGAAA CCGTCGGATA CGCGGTTACA AACAATAGCG AGGGCTTCTT 
1001 CCTATGCAAA CTTACCCATA CAGTAAAAGG AGAACGCGTA tcgttccccg TGTGCACCTA TATCCCCCCC ACCATATGCG ATCAGATCAC CCGCATAATG 
1101 OCCACGGATA TCTCACCTGA CGATGCACAA AAACTTCTGG TTCGGCTCAA CCAGCGAATC GTCATTAACG GTAAGACTAA CAGGAACACC AATACCATGC
1201 AAAATTACCT TCTGCCAATC ATTGCACAAG GGTTCAGCAA ATCGGCCAAG GAGCGCAAAG AAGATCTTGA CAATGAAAAA ATGCTGGCCA CCAGAOAGCG 
1301 CAAGCTTACA TATGGCTGCT TGTGCCCGTT TCGCACTAAG AAAGTGCACT CGTTCTATCG CCCACCTGGA ACGCACACCA TCGTAAAAGT CCCAGCCTCT 
1401 TTTAGCGCTT TCCCCATGTC ATCCGTATCO ACTACCTCTT TGCCCATGTC GCTGAGGCAG AAGATGAAAT TGGCATTACA ACCAAAGAAG GAGGAAAAAC 
1501 TCCTGCAAGT CCCGGAGGAA TTAGTTATGG AGGCCAAGGC TGCTTTCGAG GATGCTCAGG AGGAATCCAG AGCCGAGAAC CTCCGAGAAG cactcccacc 
1601 ATTAGTGGCA CACAAAGGTA TCGAGGCACC TCCGGAAGTT CTCTGCOAAG TGGAGGGGCT CCAGGCGGAC ACCGGAGCAG CACTCCTCGA AACCCCOCGC 
1701 GGTCATGTAA CGATAATACC TCAACCAAAT OACCCTATGA TCGGACAGTA TATCGTTGTC TCCCCGATCT CTGTCCTGAA GAACCCTAAA CTCGCACCAO 
1801 CACACCCGCT AGCAGACCAG GTTAAGATCA TAACGCACTC CGGAAGATCA GGAAGGTATG CAGTCGAACC ATACGACGCT AAAGTACTGA TGCCAGCAGO
1901 AAGTGCCGTA CCATCGCCAG AATTCTTACC ACTGAGTGAG AGCGCCACGC TTGTGTACAA CGAAAGAGAG TTTGTGAACC GCAAGCTGTA CCATATTOCC 
2001 ATGCACGGTC CCGCTAAGAA TACAGAAGAQ GAGCAGTACA AGGTTACAAA GGCAGAGCTC GCAGAAACAG AGTACGTGTT TGACGTGGAC AAGAAGCGAT 
2101 CCGTTAAGAA GGAAGAAGCC TCAGGACTTO TCCTTTCGOG AGAACTGACC AACCCGCCCT ATCACGAACT AGCTCTTGAG GGACTGAACA CTCGACCCGC 
2201 GGTCCCGTAC AAGGTTGAAA CAATAGGAGT GATAGGCACA CCAGGATCGG GCAACTCAGC TATCATCAAG TCAACTGTCA CGGCACGTGA TCTTGTTACC
2301 AGCGGAAAGA AAGAAAACTG CCGCGAAATT GAGGCCGACG TCCTACGGCT GAGGGGCATG CAGATCACGT CGAAGACAGT GCATTCCGTT ATGCTCAACG 
2401 GATGCCACAA AGCCCTAGAA GTGCTGTATG TTGACGAACC GTTCCGGTGC CACCCAGGAG CaCTACTTGC CTTGATTGCA ATCGTCAGAC CCCGTAAGAA 
2501 GGTACTACTA TGCGGAGACC CTAAGCAATG CCGATTCTTC AACATGATGC AACTAAAGGT ACATTTCAAC CACCCTGAAA AAGACATATG TACCAAQACA
2601 ttctacaagt ttatctcccg acgttgcaca cagccagtca cggctattgt atcgacactg cattaccAtc CAAAAATGAA AACCACAAAC CCGTGCAAGA
2701 AGAACATCGA AATCGACATT ACAGGGGCCA CGAAGCCGAA GCCAGGGCAC ATCATCCTGA CATGTTTCCG CGGGTGGGTT AAGCAACTGC AAATCGACTA
2801 TCCCGGACAT GAGGTAATGA CAGCCGCGGC ctcacaaggg CTAACCAGAA AAGGAGTATA TGCCGTCCGG CAAAAAGTCA ATGAAAACCC GCTGTACGCG
2901 ATCACATCAG agcatgtgaa cgtgttgctc ACCCGCACTG AGGACAGGCT AGTATGGAAA ACTTTACAGG gcgacccatg GATTAAGCAG ctcactaacg 
3001 TACCTAAAGG AA A HI > L AG GCCACCATCG AGGACTGGGA AGCTGAACAC AAGGGAATAA TTGCTGCGAT AAACAGTCCC GCTCCCCGTA CCAATCCGTT 
3101 CAGCTGCAAG ACTAACGTTT GCTGGGCGAA AGCACTGCAA CCGATACTCG CCACGGCCGG TATCGTACTT ACCGGTTGCC AGTGGAGCGA GCTGTTCCCA 
3201 CAGTTTGCGG ATGACAAACC ACACTCGGCC ATCTACGCCT TAGACCTAAT TTGCATTAAG TTTTTCCGCA TGGACTTGAC AAGCGGCCTG TTTTCCAAAC 
3301 AGAGCATCCC GTTAACCTAC CATCCTGCCG actcagcgao GCCAGTAGCT CATTGGGACA ACAGCCCACG AACACCCAAG TATGGGTACG ATCACCCCGT 
3401 TGCCCCCGAA CTCTCCCGTA GATTTCCGGT GTTCCAGCTA GCTGGGAAAG CCACACAGCT TCaTTTGCAG ACGCCCAGAA CTAGAGTTAT CTCTGCACAG 
3501 CATAACTTGG tcccagtgaa ccgcaatctc CCTCACGCCT tagtccccga GCACAAGGAG AAACAACCCG GCCCGGTCGA AAAATTCTTG AGCCAGTTCA 
3601 AACACCACTC CCTACTTGTG ATCTCAGAGA AAAAAATTGA AGCTCCCCAC AAGAGAATCG AATGGATCGC CCCGATTGGC ATAGCCGCCG CAGATAAGAA 
3701 CTACAACCTG GCTTTCGGGT TTCCGCCGCA GGCACGGTAC GACCTGGTGT TCATCAATAT TGGAACTAAA TACAGAAACC ATCACTTTCA ACAGTGCGAA 
3801 CACCACCCGO CCACCTTOAA AACCCTTTCG CCTTCCCCCC TGAACTCCCT TAACCCCGGA OCCACCCTCC TGGTGAAGTC CTACCCTTAC GCCOACCGCA

3901 atactgagga cctagtcacc cctcttccca gaaaatttgt caoaotutct CCACCGACCC CAOAOTGCGT CTCAACCAAT ACAQAAATCT acctoatttt 
4001 CCCACAACTA OACAACACCC CCACACCaCA ATTCACCCCC CATCATTTOA ATTCTGTCAT TTCCTCCCTO TACOACGCTA CAAOACACOC aottcgaocc 
4101 CCACCCTCGT ACCCTACTAA AAGCCAGAAC ATTCCTOATT CTCAACACGA ACCAGTTOTC AATCCACCCA ATCCACTCCG CAOACCACGA GAAGGAOYCT 
4201 CCCGTCCCAT CTATAAACCT TOCCCCAACA C7TTTCACCCA TTCACCCACA CAOACACCTA CCCCAAAACT CACT C TCT C C CAAGCAAACA AAGTGATCCA 
4301 CCCGGTTGGC CCTGATTTCC GGAAACACCC AGAGGCAGAA GCCCTCAAAT TGCTGCAAAA CGCCTACCAT GCAGTGGCAG ACTTAGTAAA TGAACATAAT 
4401 ATCAAGTCTG TCCCCATCCC ACTGCTATCT ACAGGCATTT ACCCAGCCGG AAAAOACCGC CTTGAGOTAT CACTTAACTG CTTGACAACC GCGCTAGACA 
4501 CAACTCATGC GGACGTAACC ATCTACTGCC TGGATAAGAA GTGGAAGGAA AGAATCGACG CGGTGCTCCA ACTTAAGGAG TCTGTAACTG AGCTGAACGA
4601 TGAGGATATG GAGATCCACG ACCACTTAGT ATGGATCCAT CCGGACAGTT CCCTGAAGGG AAGAAAGGGA TTCAGTACTA CAAAAGGAAA GTTGTATTCG
4701 TACT7TGAAQ gcaccaaatt CCATCAAGCA GCAAAAGATA TGGCGGAGAT AAAGGTCCTG tttccaaatg accaggaaag CAACGAACAA ctgtot oc ct 
4801 ACATATTGGC GGAGACCATG GAAGCAATCC GCGAAAAATG CCCGGTCGAC CACAACCCGT CGTCTAGCCC GCCAAAAACG CTCCCGTOCC TCTGT A TGTA 
4901 TGCCATGACG CCAGAAAGGG TCCACAGACT CAGAAGCAAT AACGTCAAAO AAGTTACAGT ATGCTCCTCC ACCCCCCTTC CAAAGTACAA AATCAAGAAT 
5001 GTTCAGAACC TTCAOTCCAC AAAAGTAGTC CTGTTTAACC CGCATACCCC CCCATTCOTT CCCGCCCGTA AGTACATAOA AGCACCAOAA CAOCCTGCAG 
5101 CTCCGCCTGC ACACGCCGAG GAGGCCCCCO GAGTTCTAGC GACACCAACA CCACCTGCAG CTGATAACAC CTCGCTTGAT GTCACGGACA TCTCACTGGA 
5201 CATGGAAGAC AGTAGCGAAG GCTCACTCTT TTCGAGCTTT AGCCGATCGG ACAACTACCO AAGGCAGGTG GTCGTGGCTG ACGTCCATGC CGTCCAAGAO 
5301 CCTGCCCCTG TTCCACCGCC AAGGCTAAAG AAGATGGCCC GCCTGGCAGC GGCAAGAATG CAGGAAGAGC CAACTCCACC GGCAAGCACC AGCTCTGCGG 
5401 ACGAGTCCCT TCACCTTTCT TTTGATGGGG TATCTATATC CTTCGGATCC CTTTTCGACG GAGAGATGGC CCGCTTGGCA GCGGCACAAC CCCCGGCAAO 
5501 TACATGCCCT ACGGATGTGC CTATGTCTTT CGGATCGTTT TCCGACGGAG AGATTGAGGA GTTGAGCCGC AGAGTAACCG AGTCGGAGCC CGTCCTGTTT 
5601 GGGTCATTTO AACCGGGCGA AGTGAACTCA ATTATATCGT CCCGATCAGC CGTATCTTTT CCACCACGCA AGCAGAGACG TAGACGCAGG AGCAGCAGGA
5701 CCGAATACTG TCTAACCGGG GTAGGTGGGT ACATATTTTC GACGCACACA GGCCCTGGGC ACTTGCAAAA GAAQTCCOTT CTGCAGAACC AGCTTACAOA 
5801 ACCCACCTTG GAGCGCAATG TTCTGGAAAG AATCTACGCC CCGGTGCTCO ACACGTCOAA AGAGGAACAO CTCAAACTCA GGTACCACAT GATGCCCACC
5901 GAAGCCAACA aaagcaggta CCAGTCTCCA AAAGTAGAAA ACCAGAAAGC cataaccact gagcgactgc tttcagggct ACGACTGTAT AACT C T GCC A 
6001 CAGATCAGCC AGAATCCTAT AAGATCACCT ACCCGAAACC ATCGTATTCC AGCAGTGTAC CAGCGAACTA CTCTGACCCA AAGTTTCCTG TAGCTGTTTG 
6101 TAACAACTAT CTGCATGAGA ATTACCCGAC GOTAGCATCT TATCAGATCA CCOACOAGTA CGATGCTTAC TTGGATATCG TAGACGCGAC AGTCCCTTGC 
6201 CTAGATACTG CAACTTTTTG CCCCCCCAAG CTTAGAAGTT ACCCGAAAAG ACACGAGTAT AGAGCCCCAA ACATCCGCAG TGCGGTTCCA tcagcoatgc 

6301 agaacacgtt gcaaaacgtg ctcattgccg cgactaaaao aaactgcaac GTCACACAAA tgcgtgaact gccaacactg gactcaocga catpcaacgt
6401 tgaatgcttt cgaaaatatg catgcaatga ccagtattco gaggagtttg cccgaaagcc aattacgatc actactgagt tcgttaccgc atacgtgccc 
6501 agactgaaag gccctaaggc ccccgcactg ttcgcaaaga cgcataattt ggtcccattg caagaagtgc ctatggatag attcgtcato gacatgaaaa 
6601 gagacgtgaa agttacacct gccacgaaac acacagaaga aacaccgaaa gtacaagtca tacaagcccc agaacccctg gcgaccgctt acctatgcgg 
6701 GATCCaCCGG gacttagtgc gcaggcttac agcccttttg CTACCCAACA ttcacacgct ctttgacatg tcggcggagg actttgatgc aatcatagca 

6801 GAACACTTCA AGCAAGGTCA CCCGGTACTG GAGACGGATA TCCCCTCGTT CGACAAAAGC CAAGACGACG CTATGGCGTT > ACCCGCCTG ATGATCTTGG 
6901 AAGACCTGGG TGTGGACCAA CCACTACTCG ACTTGATCCA CTGCCCCTTT GGAGAAATAT CATCCACCCA TCTGCCCACG CGTACCCGTT TCAAATTCGG
7001 CGCCATGATG AAATCCCGAA TCTTXCTCAC CCTCTTTGTC AACACAGTTC TGAATGTCGT TATCGCCACC ACAGTATTCG AGGAGCCGCT TAAAACGTCC 
7101 AAATGTGCAG CATTTATCCG CGACGACAAC ATTATACACG GAGTAGTATC TGACAAAGAA ATGGCTGAGA GGTGTGCCAC ctggctcaac ATGGAGGTTA 
7201 AGATCATTGA CGCAGTCATC CGCGAGAGAC CACCTTACTT CTGCGGTGGA TTCATCTTGC AAGATTCGGT tacctccaca ccgtgtcgcg tggcccaccc 

7301 cttgaaaagg ctgtttaagt tgcgtaaacc gctcccagcc gaccatgagc aagacgaaga cagaagaccc gctctgctag atgaaacaaa ggcgtggttt 
7401 agagtagcta taacagacac cttagcagtg cccgtcgcaa ctccgtatoa ggtagacaac atcacacctc tcctgctggc attoagaact tttgcccaga 
7501 gcaaaacagc atttcaagcc atcagagggg aaataaagca tctctacggt cctcctaaat agtcagcata gtacatttca tctgactaat accacaacac
7601 caccaccatg aatagaggat tctttaacat cctccccccc ccccccttcc cagcccccac tgccatotgo aggccgcgga gaaggaggca ggcggccccg 

7701 ATCCCTCCCC GCAATCGCCT CCCTTCCCAA ATCCAGCAAC TGACCACAGC CGTCAGTGCC CTAGTCATTO CACAGCCAAC TAGACCTCAA ACCCCACGCC 
7801 CACGCCCGCC GCCCCGCCAG AAGAAGCAGG CGCCAAAGCA ACCACCGAAG CCGAAGAAAC CAAAAACACA GGACAAGAAG AAGAAGCAAC CTGCAAAACC 



WO 98/36779 3/12 PCT/US98/02945



7901 CAAACCCCCA AACAOACACC CTATCCCACT TAACTTCCAC CCCCACACAC TGTTCGACCT CAAAAATCAO CACCCACATO TCATCCCOCA CGCACTOGCC 
8001 ATCGAAGGAA ACGTAATGAA ACCACTCCAC CTOAAACCAA CTATTCACCA CCCTCTOCTA TCAAACCTCA AATTCACCAA CTCGTCAGCA TACOACATCO
8101 ACTTCCCACA cttccccctc aacatgagaa gtcacccctt cacctacacc agtgaacacc ctcaaccctt ctacaactcg caccaccoag cgctocaota

8201 TACTCCACGC ACATTTACCA TCCCCCGOCO ACTACOACOC ACACOAOACA GTCCTCCTCC OATTATCCAT AACTCACCCC CCGTTCTCCC OATAGTCCTC 
8301 CCAGCCCCTC ATQACGCAAC AACAACCCCC CTTTCOCTCC TCACCTCCAA TACCAAACCO AACACAATCA ACACAACCCC CCAACCCACA CAACACTCCT 
8401 CTCCTGCACC ACTCCTCACC CCCATGTCCT TCCTTCCAAA CCTGAGCTTC CCATCCAATC CCCCCCCCAC ATCCTACACC CCCOAACCAT CCAGAGCItT
8501 CCACATCCTC GAACACAACC TOAACCACCA CCCCTACOAC A C CCI GC I LA ACCCCATATT CCCCTCCCCA TCCTCCCGCA CAACTAAAAG AACCCTCACT
8601 OACGACTTTA CCTTCACCAG CCCGTACTTO CCCACATGCT CCTACTCTCA CCATACTOAA CCCTGCTTTA CCCCCATTAA GATCGAGCAG GTCTGGOATO
8701 AAGCGOACGA CAACACCATA CGGATACAGA CTPCCGCCCA GTTTGGATAC GACCAAAGCG GAGCAGCAAO CTCAAATAAG TACCCCTACA TGTCOCTCOA 
8801 GCACGATCAT ACTCTCAAAG AAGGCACCAT CCATGACATC AAGATCAGCA CCTCAGOACC GTCTACAACO CTTACCTACA AAGGATACTT TCTCCTCCCO 
8901 AAGTGTCCTC CAGGGGACAC CGTAACGGTT AGCATAGCGA GTAGCAACTC ACCAACGTCA TGCACAATCG CCCGCAAGAT AAAACCAAAA TTCGTCGGAC
9001 GGGAAAAATA TGACCTACCT CCCGTTCACG GTAAGAAGAT TCCTTOCACA GTGTACGACC GTCTGAAA3A AACAACCCCC CGCTACATCA CTATOCACAG 
9101 GCCCGGACCO CATGCCTATA CATCCTATCT GGAGGAATCA TCAGOOAAAO TPTACGCGAA CCCACCATCC GGGAAGAACA TTACGTAC3A GTCCAAGTGC 
9201 GGCGATTACA AGACCGGAAC CGTTACCACC CGTACCOAAA TCACCGOCTO CACCGCCATC AAGCAGTGCG TCCCCTATAA CAGCGACCAA ACGAAGTCGG 
9301 TCTTCAACTC GCCGGACTCG ATCAOACACO CCGACCACAC GGCCCAAGGG AAATTGCATT TGCCTTTCAA GCTGATCCCG AGTACCTCCA TGCTCCCTCT
9401 TGCCCACGCG CCGAACGTAG TACACGCCTT TAAACACATC AGCCTCCAAT TAGACACAGA CCATCTGACA TTGCTCACCA CCaCOAGACT AGGGGCAAAC 
9501 CCCGAACCAA ccactcaatc GATCATCGOA AACACCGTTA GAAACTTCAC CGTCGACCGA GATGGCCTGG aatacatato GGGCAATCAC gaaccaotaa 

9601 gggtctatcc ccaagagtct gcaccagoao accctcacgg atgcccacac gaaatagtac accattacta tcatcgccat cctgtgtaca ccatcttagc 
9701 cgtcgcatca gctgctgtgc cgatcatoat tcgcgtaact gttgcaccat tatgtgcctg taaaccgcgc cgtcagtgcc tgacgccata tgccctgccc 

9801 CCAAATCCCC TCATTCCAAC TTCGCTGGCA CTTTTCTGCT GTGTTAGGTC GGCTAATGCT GAAACATTCA CCGACACCAT GaGTTACTTA TCGTCGAACA
9901 CCCAGCCCTT CTTCTCGCTC CAGCTGTGTA TACCTCTGCC CGCTCTCGTC GTTCTAATGC GCTGTTCCTC ATCCTGCCTG C HIN M AC TCGTTCCCGO
10001 CGCCTACCTC GCGAAGGTAG ACGCCTACGA ACATGCGACC ACTG7TCCAA ATGTGCCACA GaTACCGTAT AAGGCACTTG TTGAAAGGCC AGGGTACCCC
10101 CCCCTCAA7T TGGAGATTAC TGTCATGTCC TCGOAGGTIT TCCCTTCCAC CAACCAACAG TACA1TACCT gcaaattcac CACTCTGCTC ccctcccci a

10201 aagtcagatc ctgcggctcc ttcgaatgtc agcccgccoc tcacgcacac tatacctgca aggtctttgg aggggtgtac cccttcatgt ggggaggagc 

10301 ACAATCTTTT TGCGACAGTO AGAACAGCCA GATGAGTCAO GCGTACGTCG AATTGTCAGT AGATTGCGCO ACTGACCACG CCCAGGCGAT TAAGGTGCAT 
10401 ACTCCCGCGA TOAAAGTAGG ACTGCOTATA GTGTACGGGA acactaccag tttcctagat GTGTACGTGA ACCGAGTCAC accaggaacg tctaaagacc 
10501 TGAAAGTCAT acctgcacca atttcagcat tctttacacc attcgatcac aagctcgtta tcaatcgccc cctcgtgtac aactatgact ttccggaata 
10601 cggaccgatg aaaccagcag cgtttggaga cattcaagct acctccttoa ctagcaaaga cctcatcccc agcacagaca ttacgctact caacccttcc 

10701 CCCAAGAACC TCCATGTCCC GTACACGCAG GCCGCATCTG GATTCGAGAT GTGGAAAAAC AACTCACCCC GCCCACTCCA GCAAACCCCC CCmTGGGT 

10801 GCAACATTGC agtcaatccg cttcgagcgg TGGACTGCTC ATACGGGAAC attcccattt ctattgacat cccoaacgct gcctttatca ggacatcaga 
10901 tgcaccactg gtctcaacag tcaaatgtga tgtcagtgag tccacttatt cagcggactt cggagggatg gctaccctgc agtatgtatc cgaccgcc\a 
11001 ggacaatgcc ctgtacattc gcattcgagc acagcaaccc tccaagagtc cacagttcat ctcctggaga aaggagcggt gacagtacac ttcagcacco
11101 cgagcccaca ggcgaacttc attgtatcgc tgtgtggtaa gaagacaaca TGCAATGCAG AATGCAAACC accacctgat catatcctga ccaccccgca 
11201 CAAAAATGAC caagaattcc aagcccccat ctcaaaaact tcatggagtt ggctgtttgc ccttttccgc ggcgcctcgt cgctattaat TATAGGACTT

11301 ATGATTTTTG CTTGCAGCAT GATCCTCACT AGCACACCAA GATGACCGCT ACGCCCCAAT GACCCGACCA CCAAAACTCG ATGTACTTCC GAGGAACTGA 

11401 TCTCCATAAT GCATCAGCCT cgtatattao atcccccctt ACCCCGGCCA atatagcaac accaaaactc gacgtatttc cgagcaaccg cagtgcataa 

11501 TCCTCCCCAC TGTTGCCAAA TAATCACTAT ATTAACCATT TATTCAGCCG ACGCCAAAAC TCAATGTATT TCTGAGGAAG CATGGTGCAT AATCCCATGC
11601 ACCGTCTCCA TAAL1 1 1 11 A TTAT7TCTTT TATTAATCAA CAAAATTTTC TTTTTAACAT TTC

Amino Acid Sequence of the Nonstructural Polyprotein



1 MEXTWNVDV DPQSPPWQL QKSFFQFEW AQQVTPNDHA NARAFSHLAS KUELEYFTT ATILOICJAP tRRMFSEHQY HCYCPMRSPB DPORMMJCYAS

101 KLAEXACKIT NKNLHEXDCD LXTVLDTtTOA ETT3LCFHND VTCHTRAEYS VMQDVYIHAP GTTYHQAMKG VRTLYWIGFD TTQFMFSAMA CSYPAYKTNW 

201 ADEKVLEARN IGLCSTKUE GRTGKUIMR KCEUCFGSRV YFSVGnLYF EHRASLQSWH LPSYFHUCGK QSYTOtCDTV VSCEOYWKX mSPCTTCE 

301 TVOYAVTNNS EGFLUXVTD TVXCEHVSFP VCTYlFATtC DQMTGtMATD BPDDAQKLL VCLNQWVIN OCTMUmOM QNYLLFOAQ OFSKWAKERK 

401 EDLDNEKMLG TRERKLTTGC LWAFRTXKVH SFYRPPGTQT (VKVFASF3A FPMSSVWTTS LFMELRQKMK LAIQPKKEEX LLQVPEELVM EAKAAFBDAQ 

501 EESRAEXLRB ALPFLVADXG 1EAAAEWCB VtCUJADTOA ALVBTPRGHV R1IFQANDRM WQYTWJPt SVUCNAXLAP AHFLASQVXf mtSORSGRY 

601 AVEFYDAKVL MPAGSAVPWP EFLALSESAT tVYNEREFVM RKLYMAMHG PAJCNTEBEQY KVTKAEIAET EYVFDV0XXR CVKXEBASCL VLSGUTNPF 

701 YHELALEGLK TRPAVPYXVE TTGVtGTTGS GKSAtDCSTV TARDL VTSGX KENCREJEAD VLRLRGMQIT OCTVDSVMLM GCKXAVEVLY VDEAFRCHAG 

801 ALLAUAIVR PWOCWLCCD MCQCCFFNMM QLXVHFNKPE KDICTXTFYK FtSRRCTQFV TAIVSTLHYD GKMKTTNFCC KNXEIDTTCA TKFXFODnL 

901 TCFROWVKQL QIDYPCHBVM TAAASQCLTR RGVYAVRQXV NENPLYAfTS EHVNVLLTRT EDRLVWKTLQ GDPWDCQLTN VFKGNFQA71 EDWBAEHXC1

1001 IAAINSPAPR TNPPSOCTNV CWAKALEPU. ATAC1VLTCC QWSELFPQFA DOXPHSAIYA LDVICDCFFG MOLTSCUSC QSFLTYHPA DIARPVAHWD 

1101 NSPOTRKYGY DHAVAAEUR RFFYFQLAGX GTQLDLQTCR TRVISAQHNL VFVNRNLPHA LVPEHKEXQP CPVEKFUQF KKUSVLV1SE KXIEAPKXU 

1201 EWIAMGIAO ADJCNYNLAfG FFPQARYDLV FINKjTXYRN HKFQQCEDKA ATUGTLSRSA LNCLNPCCTL WKSYCYADR NSEDWTALA RKTOVSAAR 

1301 PECVSNTEM YUFRQLDNS RTRQFTFHHL NCYBJVYEO TROGVQAAPS YRTXRENUD CQEHAWMAA NFtGRPGBGV CRAIYXRWfN SFTDSATBTG 

1401 TAXLTVCQCX KVIHAVCPOF RKHFEAGALK UQNAYHAVA OLVNEHNIKJ VAIPLLSTCI YAACKDRLEV SLNCXTTALO JtTDADVTTYC LDKXWKEJUD 

1501 AVLQLXESVT ELJCDEDMEID DELVWIHPDS CLXGRXGFCT TTGKLYSYFE GTKFHOAAXO MABOtVLFFN DQESNEQLCA YILGETMEAI REXCFVDKNP 

1601 SSSPPKTLPC LCMYAMTPER VHRLRSNNVK EVTVCOTPt. PKYKDCNVQK VQCTXWL5N FHTFAFVPAR KYtEAfEQFA APPAQAEEAP GWATPTPPA 

1701 ADNTSLDVTD BLDMEDSSE GSLFSSFSGS DMYTUtQVWA DVKAVQEPA? VPTTRUCKMA RLAAARMQEE PTPPAST33A OESLKUFDO VStSFOSLFD 

1801 GEMARLAAAQ PFASTCPTDV PMSFGSFSDO ETPFf^RRVT ESEPVLFGSF EPGBVNSnS SRSAVSFPPR XQRRRRRSRR TEYCLTGVGO T t lWU I OfU

1901 HLQKKTVLQN QLTEFTIERN VLERIYAPVL OrTSKEEQUCL RYQMMITEAN KSRYQSRXVE NQKA/TTERL UGLRLYNSA TDQPECYX1T YPXFSYSSSV 

2001 PANYSOPKFA VAVCNNYLHE HYTTVASYQI TDEYDAYLOM VOOTVACLOT ATFCPAXLAS YPKRHEYRAP NIRSAVP3AM QKT1QNVUA ATXRNCNVTQ

2101 MRELPTLDSA TFNVECFRKY ACNDEYWEEF ARKPIRnTE FVTAYVARLX GPKAAALFAX TKNLVPLQEV PMDRFVMDMK RDVKVTPCTK HTEBRPKVQV 

2201 1QAAEPUSTA YLCQHRELV RRLTAVLLPN 1KTLFDMSAB DPDAUAEHF KQGDPVLETD 1ASFDKSQDD AMALTGLMlt EDLGVDQPLL DUECAFCEI 

2301 SSTHUTOTR FXFGAMMKSG MFITLFVNTV LNW1ASRVL EERIXTSKCA APIGDDN1IH GWSOKEMAE RCATWLNMEV KITDAY1GER PPYFCGGFIL 

2401 QOSVTSTACR VADPIXRLFK LGXPLfADDE QDEDRRRALL DETXAWFRVO FTDTLAVAVA TRYEVDNTT? VLLALRTFAQ SKRAFQAIRG EOtHLYGGPK



Amino Acid Sequence of the Structural Polyprotein 



1 MNRGFFNMLO RRPFPAPTAM WRPRRRRQAA PMPARNGLAS QKJQLTTAVS ALVtGOATRP QTPRPIIPPPR QKXQAPKQPP KPKKPKTQEX KXXQrAKPXP

101 GXftQRMAUCL EADRLFOVKN EOGDVIGHAL AMEGXVMXFL HVKGTOHFV tSKUCFnUS AYOMEFAQLP VHMRSEAPTY T3EH7EGFYN WHHOAVQYSG 

201 CRFTTPRGVG GRGOSGRPIM DNSGRWAIV LGCADEGTRT ALSWTWNSX 0X1 11 1 1 P EG TEEWSAAFLV TAMCLLGNVS FPCNRPPTCY TREP3RALDI

301 LEENVNHEAY DTLLNAILRC GSSGRSKRSV TDOFTLTSPY LGTCSYCHKT EPCFSPKIE QVWDEADDNT RIQTXAOFO YDQSGAASSN KYRYMSLEQD 

401 HTVKEGTVIDD KtSTSGPCR RLSYKGYFU. AKCPfGDSVT VS1ASSNSAT SCTMARKKP KFVOREKYDL PPVHGJGCIPC TVYDRUCETT AGYITMKRFO 

501 PHAYTSYLEB SSGKVYAKPP SGXNTTYECK CCOYXTGTVT TRTEITGCTA DCQCVAYKSO QTXWVFNSPD SIRHADHTAQ GKLHLPFXU P3TCMVFVAH

601 APffWKGFKH BLQUXfDHL TU.TTRRLGA NPEPTTEW1I GKTVRNFTVO ROGLEYIWCN HEPVRVYAQE SAFGOPHGWP HEIVQHYYHR HPVTTTLAVA 

701 SAAVAMMIGV TYAALCAOCA RRECLTPYAL APKAVIPTSL ALLCCVRSAN AETFTETWSY LWSNSOPFFW VQLCIFtAAV WLMRCCSCC LPFLWAOAY 

801 LAKVDAYEKA TTVfNVKJIP YKALVERAGY APLNLEITVM SSEVLPSTNQ EYTTCKFTTV VPSPKVRCCG SLECQPAAHA OYTCXVFGCV YPFMWGGAQC

901 FCDSENSQMS EAYVEUVDC ATDHAQAKV KTAAMXVGIJt fVYGKTTSFL DVYYNGVTPG TSKDLXYIAG PISALFT7FD KXWMRGLV YNYDFPEYGA 

1001 MKPOAFGDIQ ATSI.TSROU ASTDIRUJCP SAKNVHVPYT QAA5GFEMWK NNSGRPLQET APFGCKIAVN PLRAVDCSYO NIPtSTOIPN AAF1RTSDAP 

1101 LVSTVKCDV3 ECTYJADFGO MATLQYVSDR EGQCPVKSHS JTATtQESTV HVLEXGAVTV HFSTA3PQAN FTVSLCGKXT TCNAEOCPPA DtOVSTTHKN 

1201 DQEFQAAISX TSWSWLFALF GGASSLU1G LMIFACSMMt TTTRR

1 NTTONCCGCG TAOTATACAC TATTCAATCA AACACCCGAC CAATTCCACT ACCATCACAA TCGACAACCC AGTAGTTAAC GTAOACOTAG ACCCCCAOAO
101 TCCGTTTCTC OTCCAACTCC AAAAGaGCTT COCCCAATTT CACCTACTAC CACAGCAGGT CACTCCAAAT CACCATCCTA ATOCCAOACC ATTTTCGCAT 
201 CTCCCCACTA AACTAATCOA CCTCCACCTT CCTACCACAO CGACGATTTT CCACATACCC ACCCCACCCO CTCGTAOAAT OTTTTCCGAO CACCAOTACC 
301 ATTCCCTTTO CCCCATGCGT ACTCCAOAAG ACCCGGACCG CATOATOAAA TATOCCACCA AACTCCCCOA AAAAOCATCC AAOATTACCA ATAAGAACTT
401 CCATOACAAO ATCAACCACC TCCCCACCCT ACTTOATACA CCCQATGCTQ AAACCCCATC ACTCTCCTTC CACAACOATO TTACCTCCAA CACGCGTOCC 
501 OACTACTCCC TCATCCACCA CCTCTACATC AACGCTCCCO CAACTATTTA CCATCACCCT ATGAAACGCQ TCCCCACCCT GTACTCGATT CCCTTCOATA 
601 CCACCCACTT CATCTTCTCO CCTATGGCAC CTTCCTACCC TCCCTACAAC ACCAACTCOO CCGACCAAAA AOTCCT COA A CCCCOTAACA TCOOA CiLIU 
701 CACCACAAAO CTCACTCAAG CCAOCACACO AAAGTTCTCO ATAATGACGA AGAACCAGTT CAACCCCCCC TCACCCCTTT ATTTCTCCCT TCCATCQACA 
801 CTTTACCCAG AACACACACC CACCTTCCAC AGCTGGCATC TTCCATCCGT CTTCCACCTO AAAOOAAACC AGTCGTACAC H UHIU U OATACAOTOO
901 TCACCTCCCA ACCCTACCTA GTGAACAAAA TCACCATCAC TCCCCCOATC ACGCGACAAA CCGTCCCATA CCCCOTTACA AACAATACCO ACCCCTTCTT 
1001 CCTATCCAAA CTTACCCATA CAGTAAAAGG ACAACCCCTA TCCTTCCCCO TOTCCACCTA TATCCCCCCC ACCATATCCO ATCAQATOAC CCCCATAATQ 
1101 CCCACCCATA TCTCACCTCA CCATCCACAA AAACTTCTCC TTCCGCTCAA CCACCCAATC CTCATTAACC CTAACACTAA CAOOAACACC AATACCATGC
1201 AAAATTACCT TCTCCCAATC ATTCCACAAC COTTCACCAA ATCCCCCAAO OACCCCAAAO AAOACCTTOA CAATGAAAAA ATCCTCOCTA CCAOAOACCO 
1301 CAACCTTACA TATCCCTCCT TCTGOOCCTT TCCCACTAAG AAAGTCCACT CCTTCTATCO CCCACCTCGA ACCCAOACCA TCCTAAAACT CCCACCCTCT 
1401 TTTACCCCTT TCCCCATCTC ATCCCTATCC ACTACCIU 1 TCCCCATCTC CCTCACCCAa AACATAAAAT TCCCATTACA ACCAAAGAAC CAGGAAAAAC 
1501 TCCTGCAAGT CCCCCACCAA TTAGTCATCO AGGCCAACGC TCCTTTCGAC GATGCTCAGG ACGAATCCAG AGCGCAGAAO CTCCGAGAAG CACTCCCACC 
1601 ATTAGTGGCA OACAAAGGTA TCGACCCAGC CCCGOAAGTT GTCT C CGAAC TGGACGCGCT CCAGGCCGAC ATCGGAGCAO CACT CG T CO A AACCCCCCGC 
1701 GCTCATOTAA GOATAATACC ACAACCAAAT OACCGTATGA TCGGACAGTA CATCGTTGTC TCGCCAACCT CTCTGCTGAA GAACCCTAAA CTCGCACCAO 
1801 CACACCCGCT AGCAGACCAC GTTAAOATCA TAACGCACTC CCCAAOATCA GGAAGGTATG CAGTCGAACC ATACQACCCT AAAGTACTGA tcccaccago
1901 aactoccgta ccatggccag aattcttaoc ACTGAGTQAG AGCGCCACCC tagtgtacaa CGAAAGAGAG tttotoaacc gcaagctgta ccatattgcc
2001 atccacgctc ccgctaagaa tacacaaoao OACCAOTACA ACGTTACAAA GGCACAGCTC GCAGAAACAO agtacctgtt tcacgtggac aacaacccat 

2101 GCGTCAAGAA GGAAGAAGCC TCAGGACTTO TCCTCTCCGO ACAACTGACC AACCCCCCCT ATCACGAACT AGCTCTTGAC CGACraAAGA CTCCACOCCT 
2201 CGTCCCGTAC AAGGTTGAAA caatacgagt GATACCCCCA CCAGGATCGG ccaactcggc TATCATCAAG tcaactctca CGGCACGTGA TCTTCTTACC 
2301 ACCGGAAAGA aagaaaactg ccccgaaatt CACGCCCATG TGCTACCGCT GAGGGCCATG CAGATCACGT COAAGACAGT ccattccgtt atcctcaaco
2401 CATCCCCCAA AGCCGTAGAA gtcctctato ttgacoaagc ottcgcgtcc cacgcaggag cactacttgc cttoattcca atcgtcagac cccgtcataa
2501 ggtagtccta tgcggagacc ctaagcaatg cggattcttc aacatgatgc aactaaacgt atatttcaac cacccggaaa aagacatato taccaaoaca
2601 ttctacaagt ttatctcccg acgttccaca cagccaotca cggctattct atcgacacto cattacgato gaaaaatgaa aaccacaaac ccgtgcaaga 
2701 agaacatcga aatcgacatt acaggcccca cgaagccgaa cccaggcgac atcatcctga catgcttccg cccctgcgtt aagcaactcc aaatccacta 
2801 TCCCGGACAT gaggtaatga cagccgcccc ctcacaacgg ctaaccagaa aagcagtata tgccgtccgo caaaaagtca atgaaaaccc cctctacgcg
2901 atcacatcag agcatgtgaa cgtgctcctc acccgcactg acoacacgct agtatggaaa actttacagg ccgacccatg gattaaccag ctcactaacc 
3001 taccaaaagg aaattttcaa gccaccatcc agcactggga agctoaacac aaggcaataa ttgctgcgat aaacagtccc gctccccgta ccaatccgtt 
3101 cacctgcaag actaacgttt cctccgcgaa acgactggaa cccatactcc ccaccgccgg tatcgtactt accgcttccc agtgoagcga gctgttccca 
3201 cagtttgcag atoacaaacc acactcggcc atctaccccc tggacgtaat ctgcattaag tttttcggca tcoacttgac aagcggactg ttttccaaac 
3301 AGAGCATCCC gttaacgtac catcctgccc attcagcgag gccagtagct cattgggaca acagcccagg aacccgcaag tatcggtacg atcaccccgt 
3401 tgccgccgaa ctctccccta gatttccggt gttccagcta gctcggaaag gcacacagct tgatttgcag acgggcagaa ctagagttat ctccgcacao 

3501 CATAACTTGO TCCCAGTGAA CCGCAATCTC CCGCACGCCT TAGTCCCCGA GCACAAGGAG AAACAACCCG GCCCCGTCAA AAAATTCTTG AGCCACTTCA 
3601 AACACCACTC CGTACTTGTG GTCTCAGAGO AAAAAATTGA AGCTCCCCAC AAGAGAATCG AATCGATCGC CCC3ATTGGC ATACCCGCCG CTGATAAOAA 
3701 CTACAACCTQ CCTTTCCCCT TTCCGCCGCA CCCACGCTAC CACCTCCTGT TTATCAATAT TGCAACTAAA TACACAAACC ATCACTTTCA GCAGTCCCAA 
3801 CACCATCCCO COACCTTCAA AACCCTCTCO CCTTCGGCCC TCAACTCCCT TAACCCCCGA CCCACCCTCG TCGTGAACTC CTACCCTTAC OCCQACCGCA
3901 ATAGTOACOA CCTAOTCAOC CCTCTTGCCA CAAAAl 1101 CAGACTCTCT CCACCGACCC CACACTCCCT CTCAACCAAT ACAOAAATCT ACCTtSATCTT 
4001 CCCACAACTA CACAACACCC CCACACCACA ATTCACCCCC CATCATCTCA ATTCTTCTOAT n CO T CCGIU TACGAOGOTA CAACAOACGC ACTTGOAGCC 
4101 GCACCCTCAT ACCCCACTAA AAGGGACAAC ATTCCTGATT CTCAACACCA ACCAGTTGTC AATGCAGCCA ATCCCCTGGG CAGACCAGGC OAACOAGTCT 
4201 CCCGTGCCAT CTATAAACGT TGGOCGAACA CTTTCACCGA TTCAGCCACA GAGACCGGCA CCGCAAAACT GACTGTGTGC CAAGCAAAGA aAGTOATCCA 
4301 CGCGGTTGGC CCTCATTTCC GGAAACACCC AGAGGCAGAA GCCCTGAAAT TCCTGCAAAA CGCCTACCAT GCAGTGGCAG ACTTAGTAAA TGAACATAAT 
4401 ATCAAGTCTO TCGCCATCCC ACTGCTATCT ACAGGCATTT ACGCAGCCGG AAAAGACCGC CTTGAAOTAT CACTTAACTG CTTOACAACC ocgctaoata
4501 GAACTGATGC GGACGTAACC ATCTACTGCC TGGATAAOAA OTGGAAGGAA AGAATCGACG CGGTGCTCCA ACTTAAGGAO TCTOTAATAO AGCTOAAGGA
4601 TGAGGATATG GAGATCGACO ACOAGTTAGT ATCGATCCAT OCGOACAGTT GCCTGAAGGQ AAGAAAGGOA TTCAGTACTA CAAAAGOAAA GTTGTATTCO 
4701 TACTTTGAAG GCACCAAATT CCATCAAGCA GCAAAAOATA TGGCGGAGAT AAAGGTCCTG 7TCCCAAATG ACCAGGAAAG CAACGACCAA CTGTOTGOCT 
4801 ACATATTGGG GGAGACCATG OAAGCAATCC GCGAAAAATG CCCGGTCGAC CACAACCCGT CGTCTAGCCC GCCAAAAACO CTGCCOTGCC TCTGCATUTA
4901 TGCCATGACG CCAGAAAGGG TCCACAGACT CAGAAGCAAC AACGTCAAAO AAOTTACAGT ATCCTCCTCC ACCCCCCTTC CAAAGTACAA AATCAAOAAC 
5001 GTTCAGAAGG TTCAGTGCAC AAAAOTAGTC CTGTTTAACC CGCATACCCC TGCATTCGTT CCCGCCCGTA AGTACATAOA AGCGCCAOAA CAGCCTGCAO 
5101 CTCCGCCTCC ACAGGCCGAG CAGCCCCCCC AAGTTGCAGC AACACCAACA CCACCTGCAG CTCATAACAC CTCCCTTGAT GTCACCGACA TCTCACTGOA
5201 CATGGAAGAC AGTAGCGAAG GCTCACTLl I 1 1 L GAGCTTT AGCGGATCOG ACAACTCTAT TACTAGTATO GACAGTTGGT CGTCACGACC TAOTTCACTA
5301 GAGATAGTAG ACCGAAGGCA GGTGGTGGTG GCTOACGTCC ATGCCOTCCA AGACCCTGCC CCTGTTCCAC CGCCAACCCT AAAOAAGATG GCCCGCCTCO
5401 CAGCGGCAAG AATGCAGGAA GAGCCAACTC CACCGGCAAG CACCAGCTCT GCCGACGAGT CCCTTCACCT TTCTTTTGGT GGGGTATCCA TGTCCTTCCG
5501 ATCCCTTTTC GACGGAGAGA TGGGCGCCTT CGCAGCGGCA CAACCCCCGG CAAGTACATG CCCTACGGAT GTCCCTATGT CTTTCGGATC GTTTTCCGAC 
5601 GGAGAGATTG AGGAGCTGAG CCGCAGAGTA ACCGAGTCTG AGCCCCTCCT GTTTGGGTCA TTTGAACCGG GCGAAGTQAA CTCAATTATA TCGTCCCGAT
5701 CAGTTGTATC 111 1LCACCA CGCAAGCAGA GACGTAGACG CAGGAGCAGG AGGACCGAAT ACTGACTAAC CGGGGTAGGT GGGTACATAT TTTCGACCGA
5801 CACAGGCCCT GGCCACTTGC AAATGGAGTC CGTTCTGCAO AATCAGCTTA CAGAACCGAC CTTGGAGCGC AATGTTCTGG AAAGAATCTA CGCCCCGOTO
5901 CTCGACACGT CGAAAGAGGA ACAGCTCAAA CTCAGGTACC AGATGATGCC CACCGAAGCC AACAAAAGCA GGTACCAGTC TAGAAAAGTA GAAAATCAOA 
6001 AAGCCATAAC CACTGAGCGA L ILL IN LAO GGCTACGACT GTATAACTCT GCCACAGATC AGCCAGAATG CTATAAGATC ACCTACCCGA AACCATCGTA 
6101 TTCCACCAGT GTACCGGCGA ACTACTCTGA CCCAAACTTT GCTGTAGCTG TTTGCAACAA CTATCTGCAT GAGAATTACC CGACGGTAGC ATCTTATCAO 
6201 ATCACCCACO AGTACGATCC TTACTTGGAT ATGGTAGACG GGACAGTCGC TTGCCTAGAT ACTGCAACTT TTTGCCCCCC CAACCTTAGA AGTTACCCGA 
6301 AAAGACACGA GTATAGACCC ccaaacactc GCAGTGCGGT TCCATCAGCG ATGCAGAACA cgttgcaaaa CGTGCTCATT CCCCCOACTA AAAGAAACTG 
6401 CAACGTCACA CAAATGCGTO AATTGCCAAC ACTGGACTCA GCGACATFCA ACGTTGAATG CTTTCGAAAA TATCCATGTA ATGACGAGTA TTGGGAGCAO 
6501 TTTGCCCGAA AGCCAATTAO GATCACTACT GAL I ILL II A CCGCATACGT GGCCAGACTG AAAGGCCCTA AGCCCGCCCC ACTGTTCGCA AAGAOGCATA
6601 ATTTGGTCCC ATTGCAAGAA GTGCCTATGG ATAGGTTCGT CATGGACATG AAAAGAGACG TGAAAGTTAC ACCTGGCACC AAACACACAO AAGAAAGACC 
6701 CAAAGTACAA GTGCTACAAG CCGCAGAACC CCTGGCGACC GCTTACCTGT GCGGGATCCA CCGCGAOTTA GTGCGCAGGC TTACAGCCGT CTTGCTACCC 
6801 AACATTCACA CGCTTTTTGA CATGTCGGCG CAGGACTTTG atccaatcat AGCAGAACAC ttcaagcaag GTGACCCGGT actgcagacg GATATCGCCT 
6901 CGTTCGACAA AAGCCAAGAC GACGCTATGG CCTTAACTGO CCTGATGATC TTGGAAGACC TGGGTGTGOA CCAACCACTA CTCGACTTGA TCGAGTGCCC 
7001 CTTTCGAGAA ATATCATCCA CCCATCTGCC CACGGGTACC CGTTTCAAAT TCCGCCCCAT GATGAAATCC GCAATGTTCC TCACGCTCTT TCTCAACACA 
7101 CTTCTGAATC tccttatcgc CAGCAGAGTA TTGGAGGAGC ggcttaaaac gtccaaatgt GCAGCATTTA TCGGCGACGA CAACATCATA CACCGAGTAG 
7201 TATCTCACAA AGAAATGCCT GAGACGTGTG ccacctggct CAACATGGAG GTTAAGATCA TTGACCCACT CATCGCCGAO AGACCGCCTT acttctocgo
7301 TCOATTCATC TTGCAAGATT cggttacctc cacagcgtct cgcgtggcgg ACCCCTTGAA AACCCTGTTT AACTTCGGTA aaccgctccc agccgacgac 
7401 GAGCAAGACG aagacagaag acgcgctctg ctagatgaaa caaaggcgtc gtttagagta ggtataacag acaccttagc agtggccgtg gcaactccgt 
7501 atgaggtaga caacatcaca cctgtcctgc tggcattgag aacttttgcc cagagcaaaa gagcatttca agccatcaga ggggaaataa agcatctcta
7601 cggtggtcct aaatagtcag catagcacat ttcatctgac taataccaca acaccaccac catgaataga ggattcttta acatgctcgg ccgccgcccc 
7701 ttccccgccc ccactgccat gtggaggccg ccgacaagga ggcagccggc cccgatgcct ccccgcaatg ggctggcttc ccaaatccag caactgacca 
7801 cacccgtcag tgccctagtc attggacagg caactagacctcaaacccca cgcccacgcc cgccgccgcg ccagaagaag caggcgccaa agcaaccacc 
7901 CAACCCCAAC AAACCAAAAA CACAGGAGAA GAAGAAGAAG CAACCTCCAA AACCCAAACC CCGAAAGACA CAACGTATGG CACTCAACTT GGACCCCOAC
8001 AGACTGTTCG ACCTCAAAAA TCACCACCCA CATCTCATCC CCCACCCACT CCCCATCCAA CCAAACCTAA TGAAACCACT CCACOTCAAA GGAACTATTG

accaccctgt cctatcaaao ctcaaattca ccaact c ctc aocatacoac atcgagttcc cacagttgcc cctcaacatg agaaotoagg ccttcaccta 

CACCACCCAA CACCCTGAAG CCTTTTACAA CTCGCAOCAC GGAGCGCTGC AGTATAGTGG AGGTAGATTT ACCATCCCCC CCCCACTACO AQOCAGAGOA 
GACACTCOTC CTCCGATTAT CCATAACTCA GGCCGGOTTO TCCCCATACT CCTCGOACCa CCTOATOACa GAACAAGAAC T OCO.M1CU OTCOTCACCr 
COAATACCAA ACCGAAGACA ATCAACACAA CCCCCCAACO GACAOAACaO TCCTCTCCAC CACCACTCCT CACCGCCaTG TCCTTCCTTC GAAACOTCAO 
CTTCCCATCC AATCGCCCGC CCACATCCTA CACCCCCCAA CCATCCACAC CTCTTCACAT CCTTGAAOAa AACCTQAACC ACCACCCCTA COACACCCTO 
CTCAACCCCA TATTCCCOTC CGGATCGTCC GGCAGAAGCA AAAGAACCCT CACTCACCAC TTTACCTTCA CCACCCCCTA CTTGGGCACA TCC T C OT A CT 
CTCACCATAC TGAACCCTCC TTTACCCCCA TTAAOATCGA CCACOTCTCO GATCAACCCO ACCACAACAC CATACGCATA CAOACTTCCO CCCAGTTTCO 
ATACGACCAA AGCGOAGCAG CAAGCTCAAA TAAGTACCCC TACATGTCGC TCGAGCACGA TCATACCGTC AAAGAAGGCA CTATCCATGA CATCAAGATC 
AGCACCTCAG GACCGTGTAO AAGGCTTACC TACAAACOAT ALIHUCLT CGCGAAGTGT CCTCCAGGGO ACAGCGTAAC GOTTACTATA GCGAOTAQCA 
ACTCAGCAAC GTCATGCACA ATGGCCCGCA AGATAAAACC AAAATTCGTG GGACGGGAAA AATATGACCT ACCTCCCGTT CACGGTAAGA AGATTCCTTC 
CACAGTGTAC GACCGTCTGA AAGAAACAAC CGCCGCCTAC ATCACTATGC ACAGGCCGGG ACCGCACGCC TATACGTCCT ATCTGGAGCA ATCATCAGGG 
AAAGTCTACG CGAAGCCACC ATCCGGAAAG AACATTACGT ACGAGTGCAA GTCCGGCGAT TACAAGACCG GTACCGTTAC CACCCGTACC OAAATCACCO 
OCTGCACCCC CATCAAGCAG TGCGTCCCCT ATAAGAGCGA CCAAACGAAG TGGGTCTTCA ATTCGCCGGA CTTOATCAOA CATCCCCACC ACACGGCCCA 
AGGQAAATTG CATTTACCTT TCAAGCTGAT CCCGAGTACC TGCATGGTCC CTCTTCCCCA CGCGCCGAAC GTAOTACACG GCTTTAAACA CATCAGCCTC 
CAATTACACA CAGACCACCT GACATTGCTC ACCACCAGGA GACTAGOGGC AAATCCGGAA CCAACTACTG aatgoatcat CGGAAAOACG GTTAGAAACT 
TCACCGTCCA CCGAGATGCC CTGGAATACA TATGGGGCAA TCACGAACCG GTAAGGGTCT ATGCCCAAGA CTCTGCACCA GGAOACCCTC ACGGATGGCC 
ACACGAAATA GTACAGCATT actaccatcg CCATCCTGTG TACACCATCT TAGCCGTCGC ATCAGCTGCT CTGGCOATCA TGATTOOCGT aactottcca 
CCATTATGTC CCTGTAAAGC GCGCCtTTGAG TGCCTGACGC CATATCCCCT GGCCCCAAAT GCCCTGATTC CAACTTCGCT CCCACTTTTG TGCTCTOTTA 
GGTCGGCTAA TGCTGAAACA TTCACCGAGA CCATGAGTTA CCTATGGTCG AACAGCCACC CA7TCTTCTG GGTCCAGCTC TGTATACCCC TGCCCGCTOT 
CATCGTTCTA ATGCGCTGTT GCTCATCCTO CCTGCCTTTT TTAGTGGTTO CCGGCGCCTA CCTGGCCAAO GTAGACGCCT ACGAACATGC OACCACTGTT 
CCAAATGTGC CACAGATACC GTATAAGGCA CTTGTTGAAA GGGCAGGGTA CGCCCCGCTC AATTFGGAGA TTACTGTCAT GTCCTCGCAG OMllGLCi r 
CCACCAACCA AOAOTACATC ACCTGCAAAT TCACCACTGT CO TCCCCTCC CCTAAAGTCA AATGCTCCOO CT C CTT OG AA TGTCAGCCCG CCCCTCACCC 
AOACTATACC TGCAAGOTCT TTGGAGGGGT GTACCCCTTC ATGTGGGGAG GAGCACAATO TTTTTCCGAC AGTGAGAACA GCCAGATGAG TGACGCOTAC 
CTCGAATTGT CAGCAGATTG CGCCACTGAC CACGCGCAGG CGATTAAGGT CCATACTGCC GCGATGAAAO TAGGACTACG TATACTGTAC GCGAACACTA 

ccagtttcct AGATGTGTAC GTGAACGGAG tcacaccagg aacgtctaaa GACCTGAAAG tcatagctcg accaatttca gcatcgttta caccattcoa 

TCACAAGGTC GTTATCCATC CCCGCCTGGT GTACAACTAT GACTTCCCGG AATACGGAGC CATGAAACCA GGAGCGTTTG GAGACATTCA AGCTACCTCC 

ttgactagca aagatctcat cgccagcaca gacattagac TACTCAAGCC ttccgccaag aacctccato tcccgtacac CCACGCCOCA tctcgattcg 
agatctggaa aaacaactca ggccgcccac tccacgaaac cgcccctttc cggtgcaaga ttgcagtcaa tccgcttcca GCGCTGGACT gctcatacgg 

GAACATTCCC ATCTCTATCG ACATCCCGAA CGCTCCCTTT ATCAGGACAT CaGATGCACC ACTCGTCTCA ACAOTCAAAT GTCaTCTCAO TGAGTCCACT 
TACTCAGCGG ACTTCCCCGG GATCGCTACC CTGCAGTATG TATCCGACCG CGAAGGACAA TCCCCTGTAC ATTCGCATTC GAGCACAGCA ACCCTCCAAG 
agtcoacagt TCATGTCCTG GAGAAAGGAG CGGTGACAGT ACACTTCAGC accgcgagcc CACACCCGAA CTTTATTGTA TCGCTGTGTO GTAAGAAGAC 
AACATGCAAT GCAOAATCCA aaccaccagc tgaccatatc gtgagcaccc cgcacaaaaa tgaccaagaa ttccaagcco CCATCTCAAA AACTTCATGG 
AGTTGGCTGT ttgccctttt CGGCGGCCCC TCGTCGCTATTAATTATACG acttatcatt tttccttcca gcatgatgct gactagcaca cgaagatgac 
cgctacgccc caatgacccg accagcaaaa ctcgatgtac ttccgaggaa ctgatgtgca taatgcatca ggctcgtata ttagatcccc ccttaccccg 
ggcaatatag caacaccaaa actcgacgta tttccgagca agcgcagtgc ataatcctcc gcactgttgc caaataatca ctatattaac catttattta 

GCGCACGCCA AAACTCAATG TATTTCTGAG GAAGCATGCT GCATAATGCC ATCCAGCGTC TGCATAACTT TTTATTATTT CTTTTATTAA TCAACAAAAT 
J M (j 1 1 1 1 1 A ACATTTN 
2. Amino Acid Sequence of the Nonstructural Polyprotein

1 MEJCfWWVOV DPQSPPWQL QK5FPQPBW AQQVTPNDHA HAftAPSHLAS KUEUVPTT ATILDIGSAP ARRMFSEHQY KCVCPMRSPB DPDRMMJCYA5 

101 KLAatAOCrr WCNLHHCOCO UTVLBTPOA ETPSLCPHND VTCNTRAEYS VMQOVYINAP CTTYKQAMKO VRTLYWIGFD TTQFMFSAMA GSYPAYNTNW 

201 ADEXVLEARN ICLCJTXLSE GRTGKLSWR K XEUCPG SRV YPSVCSTLYP EHRASLQSWH LPSVPHUCGK QSYTCRCDTV VSCEGYWXX HHPiiHUE 

301 TVCYAVTHMS ECFLLOCVTD TVKGERVSP? VCTYIFATTC D QMTGttlATP tSPDDAQXLL VCLKQWVW GJCTNRKTNTM QNYLLPOAQ GfSKWAXSUC 

401 EOUJNDCMLO TRERKLTYGC LWAFR7XKVH SFYRPPG7QT IVKVPA3FSA FPMSSVWTTS UMSLROXJX LALQPKXEEX UQVPEELVM EAJCAAPEDAO 

501 EE5RAEKLRS ALPPLVADKG IBAAAEYVCB VECLQAOIOA ALYETPROHV RflPQANDRM ICQYIYVSIT SVUCKAKUP AHPLAOQVXI mCOKSOXY

601 AVBPYOAKVL MPAOSAVPWP EFLALSESAT LYYNEREPVN UCLYHIAMHO PAKNTEEEQY KVTXAELAET EYVFDVDKKJt CVKKEBASGL VLSOBLTNPP

701 YHELALECLK TRPWPYKVE TTGVIGAPGS CXSAIDCSTV TAR0 LVT30X KENCREtQAD VLRUtOMQCT 5JCTYDSVMLN COUCAYEVLY VDEAFAOCAG

801 ALLAUAIVR PtHKWLCGD PKQCCPFNMM QUVYFNHPE KDICTKTFYX FISRRCTQPV TAIVJOHYD GXMKTTNPCX KMEHMTGA TXPKPODIIL

901 TCftOWVXQL QIDYPGH BVM TAAXSQGLTX KOVYAVIQXV NENPLYAITS EHVNVLLYRT EDRLVWICTLQ ODPWDCQLTN VPKONFOATI EOWEACKXC1

1001 I AAIHSP APR YNPPSCXTNV CWAJCRLEPIL A TAGJV LT GC Q WSELPPQPA DDXPKSAtYA LDVtCTKFFO MDUSGLFSX QSIPLTYHPA DSARPVAHWD 

1101 NSPGTRXYGY DKAVAAELSR RFFVFQLAGK GTQLDLGTGR TRVISAQHNL VPVNRNIPHA LVPEHKEKQP GPVXKFUQP KHKSVLWSB EKRAFHKRI 

1201 EWIAPIOIAO AOKNYNLAFG PPPQARYDLV FINTOTXYJLN HHFQQCBDKA ATUCTLSRSA LNCLNPGCTL WKSYOYAOR H5EOWTALA RJCFVRVSAAR 

1301 PECV3SHTEM YLIFtQLONJ RTRQPTPKHL KCVBSVYEO TRDOVOAAPS YRTKRENIAD CQEEAWNAA NPLGRPGEQV CRAIYXRWPN SFTDSATB7G 

1401 TAKLTVCQGX KVIHAVGPDP RXHPEAEAUC LLQNAYKAVA OLVNEWflXS VAIPLLSTCI YAACKORLEV SLNCLTTALO RTDADVTTYC LDKKWXBJUD

1501 AVLQUCBSVt ELXDEDMEIO DELVWIHPDS OJC GRKCPST TXCXLYSYFE GTKPHQAAXD MAEDCVLFPN DQESNEQLCA YILGETMEAI RSCCPVDKNP

1601 SSSPPKTUC LCMYAMTPER VHRLRSNNVK EVTVCSSTPL PKYXKNVQK VQCTKWUN PKTPAFVPAR KYIEAPEQPA APPAQAEEAP EVAATPTPPA 

1701 ADKTStOVTO BLOMEOSSE CSUS3FSCS ONSTTSMDSW SSGPSSLEIV DRRQVWADV HAVQEPAPVP PPRUCKMARL AAARMQEEFT PPA5TSSAOB

1801 SLKUFCGVS MSFGSLFDGE MCALAAAQPP A5TCPTDVPM 3FGSFSDGE1 EEURRVTES EPVLFG3FEP GEVNStSSR SWSFPPRKQ RRRRRSRRTB

1901 Y 



3. Amino Acid Sequence of the Structural Polyprotein

1 MNRCFFNMLG RRPFPAPTAM WRPRRRRQAA PMPARNGLA3 QtQQLTTAVS ALV1CQATRP QTPRPRPPPR QKXQAPKQPP KFKXPETQEK XJCXQPAXPKP 

101 CKRQRMALXL EADRLFDVKN EOGOVICHAL AMEGXVMXPL KVKOTTDHPV UXLKPTXU AYDMRPAQLP VNMRSSAFTY TSEKPEGFYN WHHOAVQYSO 

201 CRPYIPRGVC CRCOSCRPIM DKSCRWAIV LCGADEOTRT AUWTWN5K CXTDCTTPEC TEEWSAAPLV TAMCLLGKVJ PPCNRPPTCY TXEPSRALDf 

301 LEENVKHEAY PT UWAlLRC GSSGRSKRSV TDDPTtTSPY LGTCSYCHHT EPCPSPIKII QVWDEADDWT IXIQTSAQFO YDQSOAASSM KYRYMSLEQD 

401 KTVKECTMDO IKSTSGPCR RLSYKGYFLL AKCPPOWVT VSIASSNSAT SCTMARXOCP KFVCREXYDL PPVHOXXIPC TVYORUCETT AGYTTMHRPO 

501 PHAYT3YLEB SJOKVYAKPP SGKN1TYECK CCD YKTCT VT TRTQTGCTA OCQCVAYKSD OTKWVPNJPD URHADHTAQ CKLHLPFKU PSTCMVPVAH

601 APNWHGPXH tSLOLOTDHL TLLTTRRLGA NPEPTTEWH GXTVRNPTVD RDGLEYIWGN HEPVRVYAQB SAPGOPHGWP HEIVQHYYHR HPVYTOAVA 

701 5AAVAMMKJV TVAALCACXA RRECL7PYAL APHAVIPTSL AIACCVRlAW AETFTBTMSY LWSNSQfFFW VQU3PUIAV IVLMRCCSCC LPFLWAGAY 

801 LAKVDAYEHA TTVPNVPQIP YKALVERAGY APUOEirVM SSEVLPSTNQ EYfTCKPTTV VPSPKVKCCG SUECQPAAHA OYTOCVPGGV YPPMWGOAQC

901 FCDSENSQMS CAYVELSADC ATDHAQADCV KTAAMKVGLR IVYGKTTSFL OVYVNGVYPG T5KOUCVUG flSAJFTPFO HKW1HRGLV YNYDPPBYGA 

1001 MKPGAFGDIQ AT5LTSKDU ACTOrRLUCP SAKNVHVPYT QAASGFEMWK NNSCRPLQET APFCOCIAVM PLRAVDCJYG NtPOIDCPN AAFIRTSDAP

1101 LVSTVKCDVS ECTYSADFGG MATLQYVSOR EGQCPVHSHS 5TATLQESTV KVLEKGAVTV HFSTASPQAN FTVSLCCKKT TCNAECXPPA OHTVSTPHKN

1201 DQEFQAA1SK TSWSWLFALP GGASSLUIG LMIPAC3MML TCTRR 
Nucleotide Sequence of S55

I ATTCCCCCCC TAGTACACaC TATTCAaTCA AACAGCCCaC CAATTOCACT ACCATCACAA TCCaGAAOCC ACTAGTTAAC gtagacgtac accctcacac tccctttotc gtocaactcc 
121 AAAAGAGCTT CCCCCAATTT GAGGTAGTAG CACAGCAOCT CACTCCAAAT CACCATCCTA ATOCCACACC ATTTTCGCAT CTGCCCACTA AACTCATCCA CCTGCAGCTT CCTACCACAC 
Ml CGACCATTTT CCACATAOCC aocccacccc ctcctacaat cttttcccac CACCACTACC ATTGCGTTTG ccccatccct AGTCCAGAaO ACCCCCACCC CATCATGAAA tatcccacca 
361 AACTCGCCGA AAAACCATCT AAGATTACAA ACAaCAACTT GCATGAGAAG ATCAAGCACC TCCGCACCGT ACTTCATACA CCCGATCCTC AAACOCCATC ACTCTCCTTC CACAACGATO 

*»i ttacctccaa cacccgtdcc gagtactccg tcatccacca cgtgtacatc aaccctcccc caactattta ccaccaccct atcaaagccc tccccaccct CTACTOCATT occttccaca 

Wl CCACCCACTT CATCTTCTCG CCTATOGCAO CTTCCTACCC TCCATACAAC ACCAACTCCC CCCACCAAAA ACTCCTTCAA OCCCCTAACA TCGCACTCTC ^'fCACAAAO CTOAGTOAAC 
721 CCACGACAOC AAAGTTCTCG ATAATCACCA ACAAOCACTT OAACCCCCCO TCACCCCTTT AI 1H.1U. CT TOCATCCACA CTTTACCCAG AACACACACC CACCTTGCAO ACCTCGCATC 
Ml TTCCATCGCT CTTCCACTTC AAACCAAACC AGTCGTACAC TTGCCGCTGT CATACACTCC TCACCTOCCA ACCCTACOTA CTCAACAAAA TCACCATCAG TCCCCCOATC ACGCCACAAA 
961 CCCTOCCATA CGCOCTTACA AACAATACCO AOOCCTTCTT CCTATCCAAA CTTACCCATA CAGTAAAAGG ACAACCCCTA TCCTTCCCCO TCTCCACCTA TATCCCCCCC ACCATATCCG 
(031 ATCACATCAC CCCCATAATC CCCACCCATA TCTCACCTCA CGATCCACAA AAACTTCTCC TTCCGCTCAA CCACCOAATC CTCATTAACG GTAAOACTAA CAGGAACACC AATACCATGC 
1101 AAAATTACCT TCTGCCAATC ATTGCACAAG GCTTCAGCAA ATGOG C CAAG GAGCGCAAAa AAGATCTTCA CAATCAAAAA ATGCTGCCCA CCAGAGaGCG CAAOCTTACA TATOCCTCCT 
mt TGTGGGCGTT TCGCACTAAG AAACTGCACT CGTTCTATCC CCCACCTCGA ACGCAGACCA TCGTAAAACT CCCAGCCTCT TTTACCCCTT TCCCCATCTC ATCCCTATCO ACTA U.1L1 1 
l**| TCCCCATCTC CCTGACOCAO AAGATCAAAT TCCCATTACA ACCAAAGAAG CAGGAAAAAC TGCTGCAAGT CCCGGAGGAA TTACTTATCO AGOCCAAGOC TGCTTTCGAC GATCCtCACO 
1361 ACGAATCCAO AGCCGACAAO ctcccagaag cactcccacc ATTACTCCCA GACAAAGCTA TCCAGCCAGC TCCGGAACTT GTCTGCCAAO TCOAGGCCCT CCAGGCCCAC ACCCGAGCAG 
1631 CACTCGTCCA AACCCCGCCC GCtCATCTAA GGATAATACC TCAAGCAAA T GACCOTATGA TCGGACACTA TaTCGTTCTC TCCCCCATCT CTCTCCTCAA GAACGCTAAA CTCCCACCAa 
l»l CACACCCCCT AOCACACCAQ GTTAACATCA TAACGCACTC CGCAAGATCA ggaacgtatg cactccaacc atacgacgct AAAGTACTCA tgccaocago AACTOCCCTA CCATGGCCAG 
mi AATTCTTAGC ACTGACTGAG ag c gcca c gc TTCTCTACAA CGAAAGACAG TTTCTGAACC GCAAGCTGTA ccatattgcc atccacggtc ccgctaacaa tacagaagao oaccagtaca 
1041 ACGTTACaAA GGCAGACCTC GCAGAAACaC ACTACCTCTT TCACOTGCAC AACAAGCCAT CCGTTAACaa gcaaGaaccc tcagcacttg tcctttccgg acaactgacc aacccoccct 
1161 ATCACGAACT AGCTCTTCAO OGaCTCAAGA CTCCACCCGC GGTCCCCTAC AAGGTTGAAA CAATAGGAGT CAT/ -t-.-.ca ccaggatcgg gcaaotcagc tatca.-^o tcaactgtca 
2211 ccgcacctca tcttcttacc agcggaaaga aagaaaactg ccgccaaatt gaggcccacc tcctacggct caggggcatg cacatcacct cgaacacaot ggattccctt atcctcaacg 
2401 GATOCCACAA agccctagaa gtgctctatg ttgaccaagc cttcccctgc cacgcaggac cactacttcc cttcattgca atcctcacac cccgtaagaa gctactacta tgcggagacc 

2»f CTAAGCAATO CGGATTCTTC AACATCATCC AACTAAAGCT ACATTTCAAC CACCCTGAAA AAGACATATG TACCAAOACA TTCTACAACT TTATCTCCCC ACGTTGCACA CAGCCACTCA 
26*1 CGOCTATTCT ATCOACACra CATTACGATO GAAAAATGAA AACCACAAAC CCGTCCAAGA ACAACATCCA AATCGACATT ACAGGGGCCA CGAACCCOAA GCCACGGGAC ATCATCCTCA 

»6t catctttccg cccgtgggtt aagcaactgc aaatccacta tcccggacat cacctaatca caccccccgc ctcacaaggo CTAACCACAA aaGGAGTATA TGCCGTCCCG CAAAAACTCA 
Wl ATGAAAACCC CCTCTACGCO ATCACATCAC AGCaTGTGAA cctcttoctc acccccactc AGGACACCCT ACTATCGAAA ACTTTACACC GCGACCCATG CATTAAGCAO ctcactaacg 

3001 tacctaaago aaattttcag gccaccatco aggactcgga acctbaacac aagggaataa ttgctocgat aaacactccc cctcccccta ccaatccctt cagctgcaao actaaccttt 

3121 CCTCGGCCAA ACCACTCCAA CCGATACTGO CCACGCCCGG TATCCTACTT ACCCGTTGCC ACJTGCACCCA CCTCTTCCCA CACTTTGCCG ATCACAAACC ACACTCCCCC ATCTACGCCT 
33*1 TACACOTAAT TTGCATTAAG TTTTTCGCCA tccacttcac AACCGGGCTC TTTTCCAAAC ACAGCATCCC CTTAACCTAC CATCCTCCCC ACTCAGCCAC GCCACTAGCT CATTCCCACA 
3331 ACAGCCCAGG AACACOCAAG TATGGCTACG ATCaCGCCCT TCCCCCCOAA CTCTCCCCTA GATTtCCOGT CTTCCAGCTA GCTGCCAAAG gcacacacct tgatttccao ACGGGCAGAA 
J*flt ctacacttat CTCTCCACAG CATAACTTCC TCCCAGTGAA CCCCAATCTC CCTCACCCCTTACTCCCCCA ccacaaggao AAACAACCCG ccccgctcca AAAATTCTTO agccagttca 
3601 AACACCACTC cctacttctc atctcacaca aaaaaattca aoctccccac aacacaatcg aatggatcgc ccccattggc ataccccccg cagataagaa ctacaaccto uliulguj 
mi ttccccccca ggcaccctac cacctgctct tcatcaatat tccaactaaa tacagaaacc atcactttca acactccgaa gaccacgcco cgaccttcaa aaccctttcc ccttcccccc 
18*1 tcaactgcct taaccccgca cccaccctco tgctcaactc ctacccttac ccccaccgca atactcacga cctactcacc CCTCTTCCCA caaaatttct cagactctct ccaccgacgc 

3981 CACACTCCCT CTCAACCAAT ACACAAATGT ACCTCATTTT CCGACAACTA GaCAACAGCC GCACACGaCA aTTCACCCCC CATCATTTCA ATTCTGTCAT f l tOlLLOIU taccaggcta 
«S1 CAACAGACCa AGTTGCACCC GCACCGTCCT ACCCTACTAA AAGGGACAAC ATTCCTCATT ctcaagagga aCCACTTCTC AaTGCAGCCA ATCCACTCGG CACACCAGCA CAACCACTCT 
4201 GCCGTCCCAT CTATAAACGT TCCCCCAACA CTTTCACCCA TTCAGCCACA GAGACAGCTA CCCCAAAACT CACTCTCTCC CAAGGAAACa AACTCATCCA CCCCCTTOGC CCTCATTTCC 
*n\ CCAAACACCC AGAGGCAGAA CCCCTCAAAT TCCTGCAAAA CGCCTACCaT GCAGTCCCAO ACTTACTAAA TCAACATAAT ATCAACTCTC TCGCCATCCC actgctatct acacgcattt 

*4*t ACCCAGCCCG aaaacacccc cttcacgtat cacttaacto cttcacaacc gcgctagaca gaactcatgc ggacctaacc atctactccc tggataagaa ctccaaggaa agaatccacc 
4J61 CCCTCCTCCA acttaaggag tctctaactc agctcaagca tgaggatatg cagatcgacg accacttact atccatccat ccgcacactt ccctcaaggg aagaaaggca ttcactacta 

4Mt CAAAAGGAAA CTTCTaTTCG TACTTTCAAG GCACCAAATT CCATCAACCA CCAAAAGATA TGGCCGAGaT AAACCTCCTC TTCCCaaaTG ACCAGGAAAC CAACGAACAA CIC1ULLU 
4C01 ACATATTCGG GGACACCATC CAAOCAATCC CCGAAAAATG CCCOCTCCAC CaCaaCCCCT CGTCTAGCCC GCCAAAAACC CTCCCGTCCC TCTCTATCTA TCCCATCACG CCACAAACCG 
4«l TCCACAGACT CAGAAGCAAT AACCTCAAAO AACTTACACT ATGCTCCTCC ACCCCCCTTC CAAACTACAA AATCAACAAT CTTCACAACC TTCACTCCaC AAAAGTACTC CTGTTTAACC 
S041 CCCATACCCC CCCATTCGTT CCCGCCCCTA ACTACATAGA ACCACCACAA CAGCCTCCAC CTCCCCCTCC acaggccgao cacgcccccg CaGTTUTAGC CACACCAACA CCACCTGCAG 
3161 CTGATAACAC CTCCCTTCAT GTCACCCACA TCTCACTGCA CATGGAACaC* "aCTAGCGAAG GCrCACTCTT TTCCaCCTTT AOCGGATCCG ACAACTACCG AAGCCaGCTC mU4ILU.lt 
51S1 ACCTCCATCC CCTCCAAGaC CCTCCCCCTC TTCCACCCCC AaGCCTaaaG AACATCGCCC CCCTCCCaCC GGCAACAATC CaCCAAGaGC CAACTCCACC GCCAACCACC AGCTCTCCCG 
S*OI ACCACTCCCT TCaCCTTTCT TTTCATGGCG TATCTATaTC CTTCCGATCC CTTTTCCACG CaCaCATCCC CCGCTTGGCA GCGGCACAAC CCCCCCCAAG TACaTCCCCT ACCCATCTGC 
«2I CTATCTCTTT CGGaTCCTTT TCCCACGGaG ACaTTGAGGA CTTGaCCCGC aCaCTAACCG AGTCCCAGCC CCrCCTCTTT CCCTCaTTTO AACCCGCCCA ACTCAACTCA ATTATATCGT 
36*1 CCCCATCACC CCTATCTTTT CCACCACGCA AGCAGACaCG TAGACGCAGG AGCAGCAGGA CCCAATACTG TCTAACCCCG CTAGCTCCCT ACATATTTTC CaCCCACACA GCCCCTCCGC 
f«l ACTTCCAAAA CAAGTCCCTT CTGCaCAACC AGCTTACAGA ACCCACCTTG GAGCCCAATG TTCTCCAAAC AATCTaCCCC CCCCTGCTCC ACACCTCCAA AGACCAACAG CTCAAACTCA 
SMI GCTACCACAT CATCCCCACC G AAGCCAACA AAACCAOGTA CCAGTCTCGA AAACTAGAAA ACCACAAAGC CaTAACCACT CAGCCACTCC TTTCAGCGCT ACG GC TCT A T AACTCT GC CA 
M01 CACATCACCC ACAATCCTaT AACATCACCT ACCCCAAaCC ATCGTaTTCC ACCACTCTAC CACCCAACTa CTCTCACCCA AACTTTCCTC TAGCTCTTTO TAACAACTAT CTGCATCACA 
6121 ATTACCCCAC GCTACCATCT TATCACATCA cccaccacta CGATGCTTAC TTGCATATGG TaGaCCGGAC aCTCCCTTGC CTAGATACTG CAACTTTTTC CCCCCCCAAG CTTACAACTT 
4241 ACCCCAAAAG acacoactat ACACCCCCAA ACATCCCCAC TGCCCTTCCA TCACCGATGC ACAaCACCTT gcaaaacctc ctcattcccc CCACTAAAAG AAACTGCAAC CTCACACAAA 
«34l TCCCTCAACT gccaacactc CACTCACCCA CATTCAACCT TGAATGCTTT CCAAAATATG catgcaatca cgagtattgg caccactttc cccgaaaccc aattagcatc actactcagt 
6*01 TCCTTACCCC ATACCTCGCC ACACTCAAAG GCCCTAACCC CCCCCCACTG TTCCCAAACa CCCATAATTT CCTCCCATTa CAaGAACTCC CTATCCATAC ATTCCTCATG CaCaTCAAAA 
6601 GACACGTGAA ACTTACACCT ggcaccaaac ACACAGAACA AAGACCGAAA CTACAAGTCA TACAACCCCC AGAACCCCTO CCCACCCCTT ACCTATGCCC catccaccgg CACTTACTCC 
6721 GCAGGCTTAC AGCCGTTTTG CTACCCAACA TTCACACGCT CTTTGaCaTCi TCCOCGGaGG actttcatcc AATCATACCA gaacacttca agcaaggtca ccccgtactc gagacooata
4041 tcgcctcgtt cgacaaaagc caacaccaco ctatgccctt aaccggccto atcatcttcc aacacctcgc tctccaccaa ccactactco acttcatcca gtgccccttt ocacaaatat 

Wftl CATCCACCCA TCTGCCCACC CCTACCCCTT TCAAATTCCC GGCGATCATG AAATCCGCAA TCTTtCTCAC CCTCTTTCTC AACaCACTTC TCAATCTCCT TATCGCCAGC AflACTATTCO 
7031 ACCAGCCCCT TAAAACGTCC AAATGTGCAG CATTTATCGG CCACCACAAC ATTATACACC CAGTACTATC TCaCAAACAA ATGGCTCACA GGTGTGCCAC CTCGCTCAAC ATCCACCTTA 
HOI ACATCATTCA CCCACTCATC CGCGAGAGAC CACCTTACTT CTCCCCTOCA TTCATCTTGC AACATTCCCT TACCTCCACA GCCTGT C GCG TCOCCCACCC CTTCAAAACa CTCTTTAACT 
mi TGGGTAAACC CCTCCCACCC CACCATGAGC AACA C CAACA CAGAAGACGC CCTCTOCTAC ATCAAACAaA CGCCTGGTTT AGAGTAGGTA TAACAQACAC cttagcagto gccgtgocaa 
T44I CTCGCTATCA CCTACACAAC ATCACACCTG TCCTCCTGGC ATTCACAACT TTTOCCCACA CCAAAAGACC ATTTCAAGCC ATCACACCOG AAATAAAGCA 1L1LIA CSCT CCTCCTAAAT 
7341 ACTCACCATA GTACATTTCA TCTGACTAAT ACCACAACAC CACCACCATG AATAGAGGAT TCTTTAACAT GCTCCGCCGC CGCCCCTTCC CAGCCCCCAC TCCC A I UIUJ ACCCCCCGOA 
7«S1 GAACCAGCCA CCCCCCCCCO ATGCCTGCCC GCAATCGCCT GGCTTCCGAA ATCCAOCAAC TCACCACACC CCTCAGTGCC CTAGIGaTTQ nACAOGCAAC TACACCTCaA ACCCCACCCC 
7201 CACCCCCGCC CCCCCCCCAC AAGAAGCAGG CCCCAAACCA ACCACCCAAG CCCAACAAAC CAAAAACACA OCACAACAAO AAOAAOCAAC CTCCAAAACC CAAACCCCGA AAflrtGrtCAGC 
T9J1 GTATCGCACT TAAGTTGGAG CCCCACACAC TCTTCGACGT CAAAAATCAC CACCQACATC TCATCGGGCa CGCACTGGCC ATGGAAGGAA AGCTAATGAA aCCACTCCAC GTGAAAGGAA 
60*1 CTATTOACCA CCCTCTGCTA TCAAACCTCA AATTCACCAA GTCCTCAGCA TACCACATCC ACTTCCCACA GTTGCCCCTC AACATGAOAA CTCACCCCTT CACCTACACC ACTGAACACC 

•161 CTCAACCCTT CTACAACTCG caccacccac cgctqcagta tagtcoaogc acatttacca tcccccccgg agtagoaggc <cagoagaca gtggtcctcc oattatccat aactcacocc 
OS1 CCCT7UTCCC CATAGTCCTC CCACCCCCTO ATCACOCAAC aacaaccccc ctttccctco TCACCTCGAA TaOCAAACOO aacacaatca acacaacccc cgaaogoaca gaacagtggt 
MOI CTGCTOCACC ACTGGTCACG GCCATGTCCT TSCTTCGAAA CGTCACU IL CCATSCAATC CCCCCCCCAC ATGCTACACC CCCGAACCAT CC A C a CCTC T CCACATCCTC CAAOAOAACC 
tRl TCAACCACCA CCCCTACCaC ACCCTCCTCA ACCCCATATT CCCCrSCCCA TCCTCCCCCA GAACTAAAAC AACCOTCaCT CACCACTTTA CCTTGACCAG CCCGTACTTO CCCACATCCT 
6541 CGTACTCTCA CCaTACTCAA CCCTOCTTTA CCCCOATTAA CATCCACCAO CTCTQCGATC AACCCGACCA CAACaCCaTA CCCATACACA CTTCCSCCCA ctttcgatac caccaaaoco 
not GACCACCAAC CTCAAATAAG TACCCCTACA rCTCCCTCCA GCAGGATCaT ACTOTCAAAO AACCCACCAT CCATCACATC AAOATCACCA CCTCACCACC GTOTAOAACO CTTAOCTACA 
<UI AACCATACTT tctcctcgcg aagtgtcctc CACCCCACAO CCTAACOGTT ACCATACCOA CTACCAACTC aCCAACCTCA tccacaatco cccocaacat AAAACCAAAA ttcctcocac 
«oi cccaaaaata tcacctacct ccccttcacc gtaacaacat tccttccaca CTCTACCACC CTCTCAAAGA AACAACCCCC GGCTACATCA ctatocacac ccccccaccc cacocctata 
9111 CaTCCTATCT CCaCCAATCA """" * TGGAAAG tttaccccaa cccaccatcc gccaaqaaca TTACCTACCA CTCCAACTOC OOCCaTTACA ACACCCCAAC CCTTACCACC cctacccaaa 
91*1 TCACGGGCTG CACCCCCATC AAGCAGTGCG T C O CC T A TAA CaCCCACCAA ACGAAGTGGG TCTTCAACTC CCCGCACTCO ATCACACACO CCOACCACAC CCCCCAAOCa AAATTOCATT 
9MI TCCCTTTCAA CCTCATCCCG AGTACCTGCA TCGTCCCTGT TGCCCACGCG CCGAACGTAG TACACCCCTT TAAACaCATC ACCCTCCAAT TACACACACA CCATCTCaCA TTOCTCACCA 
948rCCACCACACT ACCCGCAAAC CCOGAACCAA CCACTCAATC CATCATCCCA AACACGGTTA GAAACTTCAC CGTCGACCGA CATCCCCTCO AATACATATG ocgcaatcac caaccagtaa 
9601 CCCTCTATCC CCAACACTCT OCACCAGCAG ACCCTCACOC ATCGCCACAC GAAATACTAC ACCATTACTA TCATCGCCAT CCTCTGTACA CCATCTTAGC CCTCOCATCA U.IUUUUJ 
mi CCATCATCAT TCCCCTAACT CTTCCACCaT TATCTGCCTG TAAACCCCCC CCTCACTCCC TCACCCCATA TDCCCICOCC CCAAATCCCC TCaTTCCAAC TTCCCTSCCA LUUUIU. I 
Wat CTCTTACCTC CCCTAATCCT CAAACATTCA cccacaccat CAOTTACTTA TCCTCCAACA CCCACCCCTT cttctggctc CACCTOTCTA TACCTCTOOC CGCTOTCGTC gttctaatgc 

wsi cam iu.il. atcctgcctg ccttttttac togttccccc cccctacctc CCCAACCTAG acccctacca acatcccacc actcttccaa atgtgccaca gataccgtat aacccacttc 

10031 TTGAAACCCC AOCCTACGCC CCCCTCAATT TCGAGATTAC TGTCATOTCC TCCCACCTTT TCCCTTCCAC CAACCAAGaO TACATTACCT GCAAATTCAC CACTGTGGTC CCCTCCCCTA 
10201 AAOTCACATO CTCCCCCTCC TTGCAATGTC ACCCCCCCCC TCACGCAGaC TATACCTGCA ACCTCTTTCC ACCCOTOTAC CCCTTCATGT CCGCAGGAG C ACAATUTTTT TCCGACAGTC 
tam ACAACACCCA GATGAG7CAG GCGTACGTCG AATTOTCAGT AGATTCCGCG ACTGACCACG CGCAGGCGAT taaggtgcat actgccgcga tgaaagtagg ACTOCCTATA CTCTACGGCA 
10441 ACACTACCAG TTTCCTACAT CTGTACCTGA ACCCACTCAC ACCAGCAA C C TCT A AAGACC TGAAAGTCaT ACCTGGACCA ATTTCACCAT TGTTTACACC ATTCCATCAC AAOCTCCTTA 
10331 TCAATCGCCG CCTCCTCTAC AACTATOACT TTCCCGAATA CGCAGCCATC AAA C CAG G aC CGTTTGGACA CATTCAAGCT ACL ILL I 1 LA CTAGCAAAGA CCTCATCGCC ACCACACACa 
10651 TTAGOCTACT CAAGCCTTCC CCCAACAACG TGCATCTCCC GTACACGCAG GCCGCATCTC GATTCGAGAT GTGOA A AAAC AACTCAGGCC GCCCACTCCA GCAAACCGCC CCTTTTCGCT 
1031 GCAAGATTGC ACTCAATCCG CTTCCAGCGO TCCACTCCTC ATACGGCAAC AULLL ATTT CTATTCACAT CCCCAACCCT CCCTTTATCA GGACATCAOA TCCACCACTO GTCTCAACAO 
10911 TCAAATGTGA TCTCaCTCAG TCCACTTATT CACCGCACTT CGCAGGCATG GCTACCCTCC AGTATCTATC CGACCGCCAA CCACAATCCC ctgtacattc GCATTCGAGC acaccaaccc 

110*1 tccaagagtc cacacttcat GTCCTGGACA AAGCACCGGT qacagtacac ttcaccaccg ccaccccaca gccgaacttc attctatcgc tgtgtcgtaa oaacacaaca tgcaatgcac 
tM6l AATGCAAACC accacctcat catatcgtca gcaccccgca CAAAAATCAC caacaattcc aagccgccat ctcaaaaact tcatggactt ccctctttoc ccttttcgcc gccgcctcct 
nui cgctattaat tatagcactt ATCATTTTTC CTTCCACCAT CaTCCTCaCT ACCACACCAA catcacccct acgccccaat caccccacca gcaaaactcg atctacttcc caogaactca 
1 1401 tctccataat gcatcaccct cctatattag atccccoctt accgcgggca ATATACCAAC accaaaactc cacctatttc ccacgaagcg cactgcataa tcctocgcac tcttcccaaa 
11521 TAATCACTAT attaaccatt tattcacccc accccaaaac tcaatctatt tctgagcaag catcctccat aatgccatcc agcctctcca taacttttta ttatttcttt tattaatcaa 

IIMI CAAAATTTTC TTTTTAACaT ttc 
Nucleotide Sequence of TR339

' ATTOCCCCCC TAGTACACAC TA1TCAATCA AACAGCCGAC CAATTCCACT ACCATCACAA TGCACAAGCC ACTACTAAAC CTACACCTAO ACCCCCAGAG TCCCTTTDTC OTGCAACTGC 
.21 AAAAAAGCTT CCCCCAATTT GAGGTACTAG CACAGCAGGT CACTCCAAAT CACCATCCTA ATTSCCACAGC ATTTTCCCAT CTOGCCACTA AACTAATCCA CCTCCACCTT CCTACCACAG 
»t CCACCATCTT CGACATAOOC ACCCCACCCC CTCGTACAAT CTTTTCCCAC CACCAGTATC ATTUR7TCT0 CCCCATCCCT AGTCCACAAG ACCCOCACCO CATOATCAAA TAWCCAOTA 
361 AACIOCCCGA AAAAGCGTGC AACATTACAA ACAAGAACTT OCATBACAAC ATTAACCATC tcccoaccct ACTTGATACO CCGOATCCTO AAACACCATC CCTCTOCm CACAACCATC 
«« TTACCTCCAA CATBCOTCCC CAATATTCCC TCATGCACGA CCTOTATATC AACOCTCCCO CAACTATCTA TCATCAGCCT ATOAAACOCO TCCCCACCCT OTACTOCATT OOCTTCOACA 
*. CCACCCACTT CATGTTC7CG OCTATCGCAO GTTCGTACCC TGCGTACAAC ACCAACTOGO CCCACCAGAA ACTCCTTOAA CCCCOTAACA TCGGACTTO CAOCACAAAO CTTUGTOAAO 
Hi CTACCACAGO AAAATTC7TCO ATAATQAGCA AC AACCACTT CAACCCCOCC TCCCGCOTTT ATTTCTCCCT ACCATCOACA CTTTaTCCAO AACACACACC CAGCTTOCAO AGCTOGCATC 
M I TTCCATCC CT CTTCCACTTO AATOOAAACC ACTCGTACAC TTGCCCCTOT OATACACTOO TQAOTTOCCA AOOCTACOTA GTGAaOAAAA TCACCATCAO TCCCCCCATC 4"??ff rt<lAAA 
»l CCOTCCCATA CCCCCTTACA CACAATACCC ACCCCTTCTT CCTATOCAAA CTTACTCACA CAGTAAAAGG AOAACOOOTA TCGTTCCCTO TOTGCACGTA CATOCGOCC ACCATATCCO 
1031 ATCAGATOAC TGOTATAATO GOCACGGATA TATCACCTGA CGATGCACAA AAACTTCTCC TTCCGCTCAA CCAGCGAATT GTXATTAACG CTAOQACTAA CAGGAACACC AACACCATOC 
1X01 AAAATTACCT TCTOCCGATC ATAGCACAAG GOTTCAGCAA ATGGOCTAAO fiAO C OCAAOO ATOATCTTCA TAACGAOAAA ATOCTOOCTA CTAGACAACO CAAQCTTACO TATOOCTOCT 
1311 TCTCCCCOTT TCGCACTAAG AAAGTACATT CGTTTTATCO CCCACCTCOA ACGCACACCA TCGTAAAAGT CCCAGCCTCT TTTAOCCCTT TTCCCATOTC OTCCCTATCO ACOACCtCTT 
1441 TGCCCATCTC CCTCACGCAG AAATTGAAAC TGGCATTGCA ACCAAACAAO GAGOAAAAAC TCCTGCAGGT CTCOOAGGAA TTAOTCATCO ^r.^ T OL11HU AO CATGCTCACO 
1*1 ACGAAGCCAO AGCGGAGAAG CICCOACAAG CACTTCCACC ATTAGTGCCA GACAAAGOCA TCOAGCCAGC CGCAOAAGTT GTCTOCGAAO TCOAOOCGCT CCAGGCCOAC ATCOCAGCAO 
16SI CATTAOTTGA AACCCCGCCC GCTCACGTAA CGATAATACC TCAACCAAAT GACCGTATGA TCGOACAGTA TATCGTTCTC TCCCCAAACT CTGTGCTOAA GAATSCCAAA CICOCACCAO 

1101 cccacccgct agcagatcaq gttaagatca taacacactc cootaoatca GOAAGGTACC cgctcgaacc atacoacgct aaagtactca tcccaccagg aootgccota ccatcgccac 
m\ aattcctacc actgagtgao agcgccacgt tactctacaa cgaaacacao tttctcaacc gcaaactata ccacattgcc atccatggcc ccgccaaoaa tacaoaaoag gaocaotaca 

2041 AGOTTACAAA OGCAOAGCTT GCAGAAACAG AGTaCOTUTT TCACGTCCaC AACAAGCGTT GCOTTAACAA GGAACAAGCC TCaGCTCTGG T CClULUAi AGAACTOACC aaccctccct 
31« ATCATOACCT AGCTCTOOAC CGACTCAAGA CCCGACCroCCOTCCCGTAC AAGGTCGAAA caataooaot OATAOGCACA cccoggtccg gcaaotcagc tattatcaao tcaactgtca 
2Hl CCCCACGGOA TCTTGTTACC AOCOGAAAGA AAGAAAATTO TCOCOAAATT CAOOCCGACO TGCTAAGACT GAGGOOTATC CAGATTACOT CGAaOACAGT AGATTCCGTT ATOC1CAACC 
3401 GATGCCACAA AGC CGTAGAA GTGCTGTACG TTCACOAAOC GTTCGCCTGC CACGCAGOAG CACTACTTGC CTTGATTGCT ATCGTCAGGC CCCGCAAOAA COTAOTACTA ICCGCAGACC 
1531 CCATOCAATO CGCATTCTTC AACATGATCC AACTAAAGGT ACATnCAAT CACCCTCAAA AACACATATG CACCAACACA TTCTACAAGT ATATCTCCCO OCOTTOCACA CAOCCACTTA 
2641 C AGCTATT CT ATCGACACTG CATTACGATO OAAAGATOAA AACCACOAAC CCOTOCAACA AGAACATTOA AATCOaTATT ACAGOGGCCA CAAAGCCGAA OCCAGOGGAT ATCATCCTCA 
2761 CATOTTTCCG CGGCICGGTT AACCAATTOC AAATCGACTA TCCCGGACAT GAAGTAATCA CACCCGCGGC CTCACAAGGG CTAACCAOAA AAGOAfflUTA T CCCCTCC OG CAAAAAGTCA 
2681 ATCAAAACCC ACTGTACCCG ATCACATCaO AGCATOTOAA CCTT7TTOCTC ACCCGCACTO AGGACAOGCT AGTOTGGAAA ACCTTCCAGG GCGACCCATO GATTAAGCAO CTCACTAACA 
3001 TACCTAAAOO AAACTTPCAO GCTACTATAG aCGACTOGOA AGCTOAACAC AAGOOAATAA TTCCTOCAAT AAACACCCCC ACTCCCCOTO CCAaKXCTT CAOCTOCAAO ACCAACOTTT 

3121 GCTOGGCOAA AGCATTCCAA ccgatactao ccac ogc c go TATCOTACTT ACCOOTTCCC AOTGGAGCGA actgttccca cagtttgcgg atcacaaacc acattcgocc atttacgcct 
3241 tagacgtaat ttgcattaag tttttcgcca tgcacttcac aagcggactg ttttctaaac aoaccatccc actaacgtac cat c cc gc c g attcagccag gccgctacct cattcgoaca 
3361 ACAOCCCAGG AACCCC CAAO tatoogtaco atcacgccat tgccgccgaa ctctcccgta gatttccggt cttccaccta gctgooaaco gcacacaacttoattoag acooooagaa 
3461 CCAGAGTTAT ctctccacag cataacctgg tcccggtoaa ccgcaatctt ccicacgccttactccccga OTACAAGCAG AAGCAACCCG ccccgctcga aaaattcttg aaccagttca 
3601 AACACCACTC AGTACnOTG GTATCAGAGO aaaaaattca agciccccot aaoaoaatcg aatgoatcgc cccgattggc atacccocto caoataaoaa ctacaaccto octttcggct 
3711 TTCCOCCGCA GGCACGCTAC gacctgotgt tcatcaacat tgcaactaaa tacagaaacc accactttca gcagtccgaa gaccatgcgg cgaccttaaa aaccctttco cgttcgcccc 
3M1 TGAATTGCCT taacccacga ggcaccctcc tcgtgaactc ctatggctac gccgaccgca acagtcagga cotagtcacc gcicttgcca gaaagtttgt cagggtgtcc ocagcoagac 

3»l CACATTCTGT CTCAACCAAT ACACAAATCT ACC1GATTTT CCCACAACTA nACAACAG C C GTACACGGCA ATTCACCCCG CACCATCTCA ATTGCOTGAT TPCCTCCGTO TATCAGQOTA 
*03 1 CAAOAGATGO ACTTCGAOCC CCGCCGTCAT ACCGCACCAA AAGGGAGAAT ATTOCTGACT OTCAAGAGGA AGCAGTTCTC AACCCACCCA ATCCGCTGGO TACACCAGOC CAAGGAGTCT 
«0I GCCGTGCCAT CTATAAACGT TGGCCGACCA GTTTTACCGA TTCAGCCACG GAGACAGGCA CCGCAAGAAT GACTGTGTCC CTAGGAAACA AACTGATCCA CGCGGTCGGC CCTOATTTCC 

-111 GGAAGCACCC agaagcagaa gccttgaaat tcctacaaaa CGCCTACCAT gcagtgocag acttactaaa tcaacataac atcaagtctc tcgccattcc actoctatct acaogcattt 
4441 ACGCAGCCGC aaaagacccc cttgaagtat cacttaactc cttgacaacc gcgctagaca gaactcacgc ccacctaacc atctattgcc tggataagaa gtcgaaggaa agaatccacg 
4541 CCGCACTCCA acttaaggao tctctaacag agctgaagca tcaagatatg gaoatcgacg atgagttagt atcgatccat ccagacactt GCTTGAAGGG AAGAAAGGCA ttcagtacta 

•681 CAAAACCAAA ATTCTATTCG TACTTCGAAG GCACCAAATT CCATCAAGCA CCAAAAGaCA TGGCCGAGAT AAAGGTCCTO TTCCCTAATG ACCAGCAAAG TAATGAACAA CTGTCTCCCT ' 
«0I ACATATTOGO TCAGACCATC GAAGCAATCC GCCAAAAGTC CCCGCTCCaC CATAACCCGT CCTCTACCCC GCCCAAAACG TTGCCCTCCC TTTCCATCTA TCCCATCACO CCACAAACGG 
4921 TCCACAGACT TAGAAGCAAT AACGTCAAAC AAGTTACAGT ATCCTCCTCC ACCCCCCTTC CTAAGCACAA AATTAAGAAT GTTCAGAAGG TTCAGTCCAC GAAAGTAGTC CTCTTTAATC 
5WI CGCACACTCC CGCATTCGTT CCCGCCCCTA AGTACATAGA AGTGCCAGAA CAGCCTACCG CTCCTCCTCC ACAGGCCCAG CAGGCCCCCG AAGTTGTAGC CaCaCCGTCA CCATCTACAC 
5161 CTCATAACAC CTCGCTTGAT gtcacagaca TCTCACTCGA TATCCATCAC agtagcgaao gctcactttt ttcgagcttt agcggatcgg ACAACTCTAT TACTACTATO GACACTTCGT 
SMI CGTCAGGACC TACTTCACTA GAOATACTAO ACCGAAGGCA CG7CGTGCTG GCTGACGTTC ATCCCOTCCA AGAGCCTGCC CCTATTCCAC CCCCAACGCt/aaaGAAGATO GCCCGCCTGG 
5401 CAGCCGCAAO AAAAGACCCC ACTCCACCGO CAAGCAATAG CTCTCACTCC CTCCACCTCT U1HLUU, CCTATCCATG TXCCtCCCAT CAATTTTCGA CCCAOAGACO CCCCGCCAGG 
5521 CAGCGCTACA ACCCCTCCCA aCaGGCCCCA CGCATOTCCC TaTCTCTTTC CGATCCTTTT CCGACGGAGA CATTGATGAG CTGAGCCCCA GACTAACTCA OTCCOAACCC CTCCTCTTTO 
5641 GATCATTTOA ACCGCCCCAA GTOAACTCAA TTATATCCTC CCCATCAGCC CTAIUIIIL CACTACCCAA GCaCaCACCT ACACGCAGCA GCAGOAGGAC TCAATACTGA CTAACCGCCG 
5T61 TAGGTCGGTA CATATTTTCC ACGGACACAG GCCCTCCCCA CTTGCAAAAG aactcccttc tgcagaacca ccttacagaa CCCACCTTCO AGCGCAATCT CCTGGAAACA attcatoccc 
5881 CGOTGCTCGA CACCTCGAAA CAGGAACAAC TCAAACTCAC GTACCAGATO ATCCCCaCCG AAGCCAACAA AACTACCTAC CACTCTCGTa AACTACAAAA TCACAAAG C C ATAACCACTC 
*00l ACCGACTACr CTCAGGACTA COACTCTaTA ACTCTCCCAC AGATCAGCCA GAATGCTATA AGATCACCTA TCCGAAACCA TTCTACTCCA CTACCOTACC CCCCAACTAC TCCGATCCAC 
6111 ACTTCCCTCT ACCTCTCTGT AACAACTATC TCCATOAGAA CTATCCGACA GTAGCATCTT ATCACaTTAC TGaCGaCTAC GATGCTTACT TCCATATGCT AOACGGGACA CTCCCCTGCC 
61*1 TCCATACTCC AACCTTCTCC CCCGCTAAGC TTAGAAOTTA CC C OAAAAAA CATGAGTATA OAGCCCCGAA TATCCGCAGT OCGCTTCCAT CAGCOATCCA GAACACCCTA CAAAATUTGC 
6361 TCATTCCCCC AACTAAAAGA AATTCCAACG TCACGCAGAT OCGTGAACTO CCAACACTCG ACTCACCCAC ATTCAATGTC CAATCCTTTC GAAAATATGC ATGTAATCAC CAGTATTGCG 
6*81 ACGAGTTCCC TCCCAAGCCA ATTACGATTA CCaCTGACTT TCTCACCCCA TATCTAGCTA GACTGAAACG CCCTAACGCC GCCCCaCTAT TTGCAAAGAC CTATAATTTO CtCCCATTCC 
«0I AAGAAGTCCC TATGCATACA TTCCTCATGG ACATGAAAAO ACACGTCAAA CTTACACCAO GCACGAAACA CACAGAAGAA aGaCCGAAAG TACAACTCAT ACAAGCCGCA GAACCCCTCG 
6721 CGACTGCTTA CTTATOCCCO ATTCACCCOC AATTAOTGCO TAGOCTTACO CCCCTCTTOC TTCCAAACAT TCACACGCTT TTTOACATOT COOCCCACCA I 1 1 1U ATOCA ATCATAGCAG
6841 AACACTTCAA GCAAGG CG A C CCCCTACTGO AGACCGATAT CGCATCATTC OACAAAAGCC AAGACCA C OC TATOOCCTTA aCC UUILILA TO AlHIUi A GCACCTOOGT OTOOATCAAC
6961 CACTACTCCA CTTGATCCAG TOCCCCTTTG GAGAAATATC ATCCACCCAT CTACCTACCC GTA HLUIM TAAATTCOCC GCGATGATOA AATCCGGAAT UULLILA CA CTTTTTOTCA
7081 ACACAOTTTT GAATGTCGTT ATCOCCACCA CACTACTACA AGAOCGCCTT AAAACOTCCA GaTGTDCAGC OTTCATTGGC GACGACAACA TCATACATGO AQTAOTATCT GACAAAGAAA
7201 IGOUU ACAO O t CCCCC ACC TGCCTCAACA TCGAOCTTAA CATCATCOAC CCACTCATCO OTGAGAQACC ACCTTACTTC TOCOOCOGAT TTATCTTGCA AOATTCOOTT ACTTCCACAa
7321 CGTGCCGCGT CO C OOACCCC CTCAAAAGCC TCTTTAACTT GCGTAAAOCO CICCCAOCCO ACCA C CACCA AGACCAACAC ACAAOACOCQ CT CIOH AQA TGAAACAAAG GCGTOOTTTA
7441 CACTACCTAT AACAOCCACT TTACCACTCO CCOTGACGAC CCOGTATCAO CTAGACAATA TTACA U.1L I CCTACTCOCA TTGAGAACTT TTOCCCAOAO fAftAAQAGCA TTCCAAGCCA 
7561 TCAGAGGGGA AATAAAGCAT CTCTACGOTO GTCCTAAATA GTCAOCATAG TACATTTCAT CTGACTAATA CTACAACACC ACGACCATGA ATAGAGGATT CTTTAACATO CTCGGCOGCC
7681 OCCCCTTCCC GGCCCCCACT GCCATGTGGA GGCCGCGGAO AAGGAGOCAO GCGGCCCOQA 1LCC1UXLU CAA LULU.IU GCTTCTCAAA TCCAGCAACT «*rCACAPPC GTCAGTCCCC
7801 TAGTCATTCG ACACOCAACT ACACCTCAAC CCCCACCTCC ACGCCCGCCA CCGCGCCAOA AGAAOCAOCC GCCCAAGCAA CCACCGAAGC CGAAGAAA CC AAAAACGCAO gAGAAHAAGA
7921 AGAAGCAACC TCCAAAACCC AAACOCGGAA AGAOA C AO C O CATOCCACTT AAGTTGGAGG CCGACAOATT CTTCOACCTC AAGAACGAGG ACGGAGATOT catcgggcac gcactgocca
8041 TGGAAGGAAA GGTAATGAAA CCTCTGCACO TGAAAGGAAC CATCGACCAC CCTGTGCTAT CAAACCTCAA ATTTACCAAG TCGTCAGCAT ACGACATGOA GTTCOCACAG TTOCCAGTCA
8161 ACATGAGAAG TGAGGCATTC ACCTACACCA OTOAACACCC CGAAG GAT TC TATAACTOGC ACCACGCAGC GGTCCACTAT ACTCGAGGTA QATTTACCAT CCCTCGCGGA OTAGGAGGCA
8281 OACOAGACAO CGGTCCTCCO ATCATGGATA ACTCCCGTCG C0TTGTCGCO ATAOTCCTCG OTOGAOCTGA TGAAGGAACA CCAACTGCCC TTTCOOTCGT CACCTOGAAT AGTAAAGGGA
8401 AGACAATTAA OACGACCCCa GAAGGGACAO AAGAGTOGTC CGCAOCACCA CTCGTCACGG CAATGTOTTT OCTCGGAAAT OTGAGCTTCC CATGCGACCO CCC O C CCACA TOCTATACCC
8521 OCOAAIU1L C AOA U. lt I C GACATCCTTO AAGAGAACGT OAACCATOAG GCCTACGATA CCCTGCTCAA TCCCATATTO CGGTGCGGAT CCTCTGGCAO AAGCAAAAGA AGCGTCACTG
8641 acgactttac CCTGACCAGC CCCTACTTCO GCACATGCTC GTACTGCCAC CATACTGAAC CGTOCTTCAG CCCTOTTAAG ATCGAGCAGG tctoogacga aocgoacgat AACACCATAC
8761 GCATaCAGAC ttccocccag tttggatacc accaaagcgo accacca a oc gcaaacaagt accoctacat gtcgcttoao caggatcaca cccttaaaga agccaccato oatgacatca
8881 agattagcac ctcacgaccg tgtagaaggc ttagctacaa aggatacttt ctcctcocaa aatgccctcc agogoacagc gtaacggtta gcatagtoao tagcaactca gcaacctcat
9001 OTACACTCCC CCCCAACATA aaaccaaaa t tcgtccgaco ooaaaaatat gatctacctc ccgttcacog taaaaaaatt ccttccacao tbtacgacco tctgaaaoaa acaactgcag 
9121 CCTACATCAC TATGCACAGG ccoogaccgc acgcttatac atcctaccto GAAGAATCAT CAGOGAAAGT TTACGCAAAO cc oc c atcto ggaagaacat tacqtatgag tgcaaotgco 
9241 GCOACTACAA ga c cooaacc gtttcgaccc gcao c gaaat cactggttcc accgccatca agcagtgcct cgcctataag agcoaccaaa cgaaotoggt cttcaactca ccgcacttca
9361 tcagacatga cgaccacacg qcccaagoga aattgcattt occtttcaag ttg a t cccg a gtacctdcat cctccctutt gcccacgcgc cgaatotaat acatoocttt aaacacatca
9481 OCCTCCAATT AGATACAGAC CACTTGACAT TGCTCACCAC CAGOAGACTA GGCGCAAACC cggaaccaac cactgaatgo atcotcggaa agacogtcag aaacttcacc otcgaccgag
9601 atcgcctcga atacatatog GGAAATCATG agccactgag cctctatocc caaoagtcao caccaogaga ccctcacgga tcgccacacg aaatagtaca gcattactac catcgccatc
9721 CTGTCTACAC catcttagcc otcgcatcag ctaccotggc gatgatgatt cccgtaaccg ttgcaotott atotccctgt aaagcgcccc gtgactocct oacgccatac gccctooccc
9841 CAAACGCCGT AATCCCAACT TCGCTOGCAC ILI1UIU.IU COTTAOGTCO GCCAATCCTO AAACOTTCAC CGAOACCATO AOTTACTTOT GGTOOAACAG TCAGCCGTTC MULOLILL
9961 AGTTGTGCAT acctttggcc gctttcatcg ttctaatgco C I 0CIU.IIL T GC TO CC I U. lim iii a gt ggttcccggc gcctacctoo cgaaggtaga cccctaccaa catoccacca
10081 ctottccaaa totoccacao ataccgtata acgcacttot toaaacggca ggotatgccc cgctcaattt ogagatcact otcatgtcct cogaootttt occttccacc aaccaagaot
10201 ACATTACCTG CAAATTCACC ACTGTGCTCC CCTCCCCAAA AATCAAATGC IIXUUHLH ' TGGAATGTCA GOGGGCCGCT CATCCAGACT ATACCTGCAA GOTCTTCOOA OOGGTCTACC
10321 CCTTTATGTO OOOAOQAOCO CAATOTTTTT GCGACAGTGA GAACAO C CAO ATOAGTOACG cgtacgtcga actotcagca cattgcgcgt CTGACCACGC ocagocoatt aaogtccaca 
10441 CTOCCOCGAT GAAAGTAGGA CTGCOTATAC TGTACCGGAA CACTACCAGT TTCCTAGATG TGTACGTGAA CGGAOTCACA CC AGQAACCT CTAAAGACTT GAAAGTCATA GCTOGACCAA 
10561 TTTCAGCATC OTTTACGCCA TTCGATCATA AGGTCOTTAT CC A T C CC GOC CTOGTOTACA ACTATGACTT CCCGGAATAT OGAGCOATGA AACCAGGAfiC OTTTOOAOAC attcaagcta
10681 U- ILL 11 LAC TAGCAAGGAT ctcatcgcca gcacagacat taggctactc aagccttccc ccaagaacgt gcatctcccc tacacgcago ccocatcagg atttoagato TGOAAAAACA
10801 ACTCAGGCCG CCCACTCCAG GAAACCGCAC LU1LUAJ1U taagattgca gtaaatccgc tcccagcgot OGACTGTTCA tacgggaaca ttcccatttc tattcacatc ccgaacccto
10921 cctttatcag gacatcagat ocaccactcg tctcaacact caaatotoaa gtcagtgagt gcacttattc agcagacttc ggcgggatgo ccaccctcca gtatotatoc GACCCCGAAG
11041 GTCAATGCCC CGTACATTCO CATTCGAGCA CACCAACTCT CCAAGAGTCG ACAOTACATO TCCTGGAGAA ACGAO CG OTO ACAGTACACT TTAGCACCCC GAGTCCACAO GCCAACTTTA 
11161 TCCTATCGCT GTGTGGGAAO AACaCAACAT GCAATGCaGA ATGTAAACCA CCAGCTGACC ATATCGTGAG CACCCCCCAC AAAAATGACC AAGAATTTCA AGCCGCCATC TCAAAAACAT
11281 CATGGAGTTO LLIL I 1ILC C CTTTTCGGCG GCGCCTCGTC GCTATTAATT ATAGGACTTA TCATTTTTGC TTGCAGCATG ATOCTOACTA GCACa C GAAO ATGACCGCTA CGCCCCAATG
11401 ATCCGACCAO CAAAACTCGA TGTACTTCCG AGGAACTGAT GTGCATAATG CATCAGGCTO GTACATTAGA TCCCCGCTTA CCGCCCGCAA TATACCAACA CTAAAAACrC CATUTACTTC 
11521 CGAGGAAGCG CAGTGCATAA TGCTGCGCAG TCTTCCCACA TAACCACTAT ATTAACCATT TATCTAGCGG ACGCCAAAAA CTCAATGTAT TTCTGAGGAA GCGTCGTGCA TAATGCCACG 
11641 CAGCGTCTGC ATAACTTTTA TTATTTCTTT TATTAATCAA CAAAATTTTG TTTTTAACAT TTC